NOVEL SURFACE EXPOSED PROTEINS FROM CHLAMYDIA PNEUMONIAE

The present invention relates to the identification of 
members of a gene family from the human respiratory pathogen 
Chlamydia pneumoniae, encoding surface exposed membrane
proteins of a size of approximately 89-101 kDa and of 56-57 
kDa, preferably about 89.6-100.3 kDa and about 56.1 kDa. The 
invention relates to the novel DNA sequences, the deduced 
amino acid sequences of the corresponding proteins and the 
use of the DNA sequences and the proteins in diagnosis of 
infections caused by C. pneumoniae, in pathology, in 
epidemiology, and as vaccine components. 

GENERAL BACKGROUND 

C. pneumoniae is an obligate intracellular bacterium
(Christiansen and Birkelund (1992); Grayston et al. (1986)).
It has a cell wall structure like that of Gram negative
bacteria, with an outer membrane, a periplasmic space, and a
cytoplasmic membrane. It is possible to purify the outer
membrane from Gram negative bacteria with the detergent
sarkosyl. This fraction is named the 'outer membrane complex
(OMC)' (Caldwell et al. (1981)). The COMC (Chlamydia outer
membrane complex) of C. pneumoniae contains four groups of
proteins: a high molecular weight protein of 98 kDa as
determined by SDS-PAGE, a double band of the cysteine rich
outer membrane protein 2 (Omp2) of 62/60 kDa, the major outer
membrane protein (MOMP) of 38 kDa, and the low molecular
weight lipoprotein Omp3 of 12 kDa. The Omp2/Omp3 and MOMP
proteins are present in COMC from all Chlamydia species, and
these genes have been cloned from C. trachomatis, C. psittaci
and C. pneumoniae. However, the gene encoding the 98 kDa
protein from C. pneumoniae COMC has not been characterized or
cloned.

The current state of C. pneumoniae serology and detection



C. pneumoniae is an obligate intracellular bacterium
belonging to the genus Chlamydia, which can be divided into
four species: C. trachomatis, C. pneumoniae, C. psittaci and
C. pecorum. Common to the four species is their obligate
intracellular growth, and that they have a biphasic life
cycle, with an extracellular infectious particle (the
elementary body, EB) and an intracellular replicating form
(the reticulate body, RB). In addition, the Chlamydia species
are characterized by a common lipopolysaccharide (LPS)
epitope that is highly immunogenic in human infection. C.
trachomatis causes the human ocular infection (trachoma)
and genital infections. C. psittaci is a variable group of
animal pathogens where the avian strains can occasionally
infect humans and give rise to a severe pneumonia
(ornithosis). The first C. pneumoniae isolate was obtained
from an eye infection, but it was classified as a non-typable
Chlamydia. Under an epidemic outbreak of pneumonia in Finland
[...] the Chlamydia genus specific test (the Lygranum test),
and the patients showed a titre increase to the untyped
Chlamydia isolates. Similar isolates were obtained in an
outbreak of upper respiratory tract infections in Seattle,
and the Chlamydia isolates were classified as a new species,
Chlamydia pneumoniae (Grayston et al. (1989)). In addition,
C. pneumoniae is suggested to be involved in the development
of atherosclerotic lesions and in initiating bronchial
asthma (Kuo et al. (1995)). These two conditions are thought
to be caused by either chronic infections, by a
hypersensitivity reaction, or both.

Diagnosis of Chlamydia pneumoniae infections

Diagnosis of acute respiratory tract infection with C.
pneumoniae is difficult. Cultivation of C. pneumoniae from
patient samples is insensitive, even when proper tissue
culture cells are selected for the isolation. A C. pneumoniae
specific polymerase chain reaction (PCR) has been developed
by Campbell et al. (1992). Even though Chlamydia pneumoniae
has in several studies been detected by this PCR, it is
debated whether this method is suitable for detection in all
clinical situations. The reason for this is that the cells
carrying Chlamydia pneumoniae in acute respiratory infections
have not been determined, and that a chronic carrier state is
expected but it is unknown in which organs and cells the
bacteria are present. Furthermore, the PCR test is difficult
to perform due to the low yield of these bacteria and due to
the presence of inhibitory substances in the patient samples.
Therefore, it will be of great value to develop sensitive and
specific sero-diagnostics for detecting both acute and
chronic infections. Sero-diagnosis of Chlamydia infections is
currently based on either genus specific tests such as the
Lygranum test and ELISA, measuring the antibodies to LPS, or
the more species specific tests where antibodies to purified
EBs are measured by microimmunofluorescence (micro-IF) (Wang
et al. (1970)). However, the micro-IF method is read by
microscopy, and in order to ensure correct readings the
result must be compared to the results with C. trachomatis
used as antigen due to the cross-reacting antibodies to the
common LPS epitope. Thus, there exists in the art an urgent
need for development of reliable methods for species specific
diagnosis of Chlamydia pneumoniae, as has been expressed in
Kuo et al. (1995): "..a rapid reliable laboratory test of
infection for the clinical laboratory is a major need in the
field". Furthermore, the possible involvement of C.
pneumoniae in atherosclerosis and bronchial asthma clearly
warrants the development of an effective vaccine.

DETAILED DISCLOSURE OF THE INVENTION

The present invention aims at providing means for efficient 
diagnosis of infections with Chlamydia pneumoniae as well as 
the development of effective vaccines against infection with 
this microorganism. The invention thus relates to species
specific diagnostic tests for infection in a mammal, such as
a human, with Chlamydia pneumoniae, said tests being based on
the detection of antibodies against surface exposed membrane
proteins of a size of approximately 89-101 kDa and of 56-57
kDa, preferably of about 89.6-100.3 kDa and about 56.1 kDa
(the range in size of the deduced amino acid sequences was
from 100.3 to 89.6 kDa except for Omp13 with the size of 56.1
kDa), or the detection of nucleic acid fragments encoding
such proteins or variants or subsequences thereof. The
invention further relates to the amino acid sequences of
proteins according to the invention, to variants and
subsequences thereof, and to nucleic acid fragments encoding
these proteins or variants or subsequences thereof. The
present invention further relates to antibodies against
proteins according to the invention. The invention also
relates to the use of nucleic acid fragments and proteins
according to the invention in diagnosis of Chlamydia
pneumoniae and vaccines against Chlamydia pneumoniae.

Prior to the disclosure of the present invention only a very
limited number of genes from C. pneumoniae had been
sequenced. These were primarily the genes encoding known C.
trachomatis homologues: MOMP, Omp2, Omp3, Kdo-transferase,
the heat shock protein genes GroEL/ES and DnaK, a
ribonuclease P homologue and a gene encoding a 76 kDa protein
of unknown function. The reason why so few genes have been
cloned to date is the very low yield of C. pneumoniae which
can be obtained after purification from the host cells. After
such purification the DNA must be purified from the EBs, and
at this step the C. pneumoniae DNA can easily be contaminated
with host cell DNA. In addition to these inherent
difficulties, it is exceedingly difficult to cultivate C.
pneumoniae and use DNA technology to produce expression
libraries with very low amounts (few µg) of DNA. It has been
known since 1993 (Melgosa et al., 1993) that a 98 kDa protein
is present in OMC from C. pneumoniae. Even though the protein
bands of 98 kDa were mentioned to be part of the OMC of C.
pneumoniae by Melgosa, the gene sequences and thus the
deduced amino acid sequences have not been determined. Only
bands originating from Chlamydia pneumoniae proteins in
general separated by SDS-PAGE are described therein.
However, the gene encoding this protein had not been
determined before the present invention. Only a very weak or
no reaction with patient sera can be observed to the 98 kDa
protein (Campbell et al. 1990), and prior to the work of the
present inventors it has not been recognized that the 89-101
kDa proteins are surface exposed or that they in fact are
immunogenic. In this report it is described that a number of
human serum samples react with a C. pneumoniae protein that
in SDS-PAGE migrates as 98 kDa. The protein was not further
characterized and it is therefore not in conflict with the
present application.

Halme et al. (1997) described the presence of human T-cell
epitopes in C. pneumoniae proteins of 92-98 kDa. The proteins
were eluted from SDS-PAGE of total Chlamydia proteins but the
identity of the proteins was not determined.

Use of antibodies to screen expression libraries is a well
known method to clone fragments of genes encoding antigenic
parts of proteins. However, since patient sera do not show a
significant reaction with the 98 kDa protein it has not been
possible to use patient serum to clone the proteins.

It was known that monoclonal antibodies generated by the
inventors reacted with conformational epitopes on the surface
of C. pneumoniae and that they also reacted with C.
pneumoniae OMC by immuno-electron microscopy (Christiansen et
al. 1994). Furthermore, the 98 kDa protein is the only
unknown protein from the C. pneumoniae OMC (Melgosa et al.
1993). The present inventors chose to take an unconventional
step in order to clone the gene encoding the hitherto unknown
98 kDa protein: C. pneumoniae OMC was purified and the highly
immunogenic conformational epitopes were destroyed by
SDS-treatment of the antigen before immunization. Thereby an
antibody (PAB 150) to the less immunogenic linear epitopes was
obtained. This provided the possibility to obtain an
antiserum which could detect the [...] protein, and it was
shown that a gene family encoding the [...] colony blotting
of recombinant E. coli.

Mice infected with C. pneumoniae generate antibodies to the
proteins identified by the inventors and named Omp4-15, but
do not recognize the SDS treated [...] antigens normally used
for SDS-PAGE and immunoblotting. However, a strong reaction
was seen if the antigen was not heat denatured. It is
therefore highly likely that [...] tests and may very likely
be used [...] vaccine for the prevention of infections.

By generating antibodies against COMC from C. pneumoniae a
polyclonal antibody (PAB 150) [...] with all the proteins.
This antibody was used to identify the genes encoding the
89.6-100.3 kDa and [...] comprising a number of similar genes
[...] genes were found in C. pneumoniae. Therefore [...]
epitopes positioned on different [...] additional genes and
to search for more members of the gene family long range PCR
with primers derived from the [...] in two gene
clusters: Omp12, 11, 10, 5, 4, 13 and 14 in one cluster and
Omp6, 7, 8, 9 and 15 in the second. Full sequence was obtained
from Omp4, 5, 6, 7, 8, 9, 10, 11 and 13, and partial sequence
of Omp12 and 14. Omp13 was a truncated gene of 1545
nucleotides. The rest of the full length genes were from 2526
(Omp7) to 2838 (Omp15) nucleotides. The deduced amino acid
sequences revealed putative polypeptides of 89.6 to 100.3
kDa, except for Omp13 of 56.1 kDa. Alignment of the deduced
amino acid sequences showed a maximum identity of 49%
(Omp5/Omp9) when all the sequences were compared. Except for
Omp13, the lowest homology was to Omp7 with no more than 34%
identity to any of the other amino acid sequences. The scores
for Omp13 were from 29-32% to all the other sequences.
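
As a rough plausibility check on the sizes deduced above, a polypeptide mass can be approximated from the length of its open reading frame. The sketch below is illustrative only: it assumes an average residue mass of about 110 Da (a common rule of thumb, not a value taken from this disclosure) and that each quoted gene length includes one stop codon. The function name `estimate_mass_kda` is hypothetical.

```python
# Rule-of-thumb estimate of polypeptide mass from ORF length.
# Assumptions (not from the source text): ~110 Da average residue mass,
# and the quoted nucleotide length includes exactly one stop codon.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(orf_length_nt: int, includes_stop: bool = True) -> float:
    """Return an approximate polypeptide mass in kDa for an ORF length."""
    if orf_length_nt <= 0 or orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    codons = orf_length_nt // 3
    residues = codons - 1 if includes_stop else codons
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Gene lengths quoted in the text above.
    for label, length in [("Omp13", 1545), ("shortest", 2526), ("longest", 2838)]:
        print(f"{label}: {length} nt -> ~{estimate_mass_kda(length):.1f} kDa")
```

For the 1545 nucleotide Omp13 gene this rule of thumb gives roughly 56.5 kDa, close to the 56.1 kDa deduced from the actual sequence; exact values depend on the amino acid composition.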

In the present context SEQ ID Nos 1 and 2 correspond to
Omp4, SEQ ID Nos 3 and 4 correspond to Omp5, SEQ ID Nos 5 and
6 correspond to Omp6, SEQ ID Nos 7 and 8 correspond to Omp7,
SEQ ID Nos 9 and 10 correspond to Omp8, SEQ ID Nos 11 and 12
correspond to Omp9, SEQ ID Nos 13 and 14 correspond to
Omp10, SEQ ID Nos 15 and 16 correspond to Omp11, SEQ ID Nos
17 and 18 correspond to Omp12, SEQ ID Nos 19 and 20
correspond to Omp13, SEQ ID Nos 21 and 22 correspond to
Omp14, and SEQ ID Nos 23 and 24 correspond to Omp15.
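
The numbering above follows a regular pattern: each Omp gene occupies a consecutive pair of SEQ ID numbers, and elsewhere in this disclosure the odd numbers are used for nucleic acid fragments and the even numbers for proteins. A minimal sketch of that lookup is given here; the helper name `describe_seq_id` is hypothetical and the function only covers SEQ ID Nos 1 to 24.

```python
from typing import Tuple


def describe_seq_id(seq_id: int) -> Tuple[str, str]:
    """Map a SEQ ID number (1-24) to its Omp name and sequence type.

    Pairs (1, 2), (3, 4), ... (23, 24) correspond to Omp4 ... Omp15;
    odd numbers are treated as nucleic acid, even numbers as protein.
    """
    if not 1 <= seq_id <= 24:
        raise ValueError("only SEQ ID Nos 1-24 are covered by this sketch")
    omp_number = 4 + (seq_id - 1) // 2
    kind = "nucleic acid" if seq_id % 2 == 1 else "protein"
    return f"Omp{omp_number}", kind


if __name__ == "__main__":
    for n in (1, 2, 19, 20, 24):
        print(n, describe_seq_id(n))
```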

The estimated sizes of the Omp proteins of the present
invention are listed in the following. Omp4 has an estimated
size of 98.9 kDa, Omp5 has an estimated size of 97.2 kDa,
Omp6 has an estimated size of 100.3 kDa, Omp7 has an
estimated size of 89.7 kDa, Omp8 has an estimated size of
90.0 kDa, Omp9 has an estimated size of 96.7 kDa, Omp10 has
an estimated size of 98.4 kDa, Omp11 has an estimated size of
97.6 kDa, and Omp13 has an estimated size of 56.1 kDa, Omp12
and Omp14 being partial.

Furthermore, SEQ ID No 25 is a subsequence of SEQ ID No 3,
SEQ ID No 26 is a subsequence of SEQ ID No 4, SEQ ID No 27 is
a subsequence of SEQ ID No 5, SEQ ID No 28 is a subsequence
of SEQ ID No 6, SEQ ID No 29 is a subsequence of SEQ ID No 7,
and SEQ ID No 30 is a subsequence of SEQ ID No 8.

Part of the Omp proteins were expressed as fusion proteins
and mice polyclonal monospecific antibodies against the
proteins were produced. The antibodies reacted with the
surface of C. pneumoniae in both immunofluorescence and [...]
89-101 kDa and 56-57 kDa protein family in C. pneumoniae
comprises surface exposed outer membrane proteins. This
important finding leads to the realisation that members of
the 89-101 kDa and 56-57 kDa C. pneumoniae protein family are
good candidates for the development of a sero-diagnostic test
for C. pneumoniae, as well as the development of a vaccine
against infections with C. pneumoniae based on using these
proteins. Furthermore, the proteins may be used as
epidemiological markers, and polyclonal monospecific sera
against the proteins can be used to detect C. pneumoniae in
human tissue or detect C. pneumoniae isolates in tissue
culture. Also, the genes encoding the 89-101 kDa and 56-57
kDa [...] protein family may be used for [...] a species
specific diagnostic test based on nucleic acid
detection/amplification.

The full length Omp4 was cloned into an expression vector
system that allowed expression of the Omp4 polypeptide. This
polypeptide was used as antigen for immunization of a rabbit.
[...] denaturing condition the antibody did not react with
the native surface of C. pneumoniae, but it reacted with a 98
kDa protein in immunoblotting where purified C. pneumoniae EB
was used as [...] reacted in paraffin embedded sections of
lung tissue from experimentally infected [...]

[...] a species specific diagnostic test for infection of a
mammal, such as a human, with Chlamydia pneumoniae, said test
comprising detecting in a patient or preferably in a patient
sample the presence of antibodies against proteins from the
outer membrane of Chlamydia pneumoniae, said proteins being
of a molecular weight of 89-101 kDa or 56-57 kDa, or
detecting the presence of nucleic acid fragments encoding
said outer membrane proteins or fragments thereof.

In the context of the present application, the term "patient
sample" should be taken to mean an amount of serum from a
patient, such as a human patient, or an amount of plasma from
said patient, or an amount of mucosa from said patient, or an
amount of tissue from said patient, or an amount of
expectorate, forced sputum or a bronchial aspirate, an amount
of urine from said patient, or an amount of cerebrospinal
fluid from said patient, or an amount of atherosclerotic
lesion from said patient, or an amount of mucosal swabs from
said patient, or an amount of cells from a tissue culture
originating from said patient, or an amount of material which
in any way originates from said patient. The in vivo test in
a human according to the present invention includes a skin
test known in the art such as an intradermal test, e.g.
similar to a Mantoux test. In certain patients being very
sensitive to the test, such as is often the case with
children, the test could be non-invasive, such as a
superficial test on the skin, e.g. by use of a plaster.

In the present context, the term 89-101 kDa protein means
proteins normally present in the outer membrane of Chlamydia
pneumoniae, which in SDS-PAGE can be observed as one or more
bands with an apparent molecular weight substantially in the
range of 89-101 kDa. From the deduced amino acid sequences
the molecular size varies from 89.6 to 100.3 kDa.

Within the scope of the present invention are species
specific sero-diagnostic tests based on the usage of the
genes belonging to the gene family disclosed in the present
application.

Preferred embodiments of the present invention relate to
species specific diagnostic tests according to the invention,
wherein the outer membrane proteins have sequences selected
from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID
NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID
NO: 22, and SEQ ID NO: 24.

When used in connection with proteins according to the
present invention the term "variant" should be understood as
a sequence of amino acids which shows a sequence similarity
of less than 100% to one of the proteins of the invention. A
variant sequence can be of the same size or it can be of a
different size as the sequence it is compared to. A variant
will typically show a sequence similarity of at least 50%,
preferably at least 60%, more preferably at least 70%, such
as at least 80%, e.g. at least 90%, 95% or 98%.

The term "sequence similarity" in connection with sequences
of proteins of the invention means the percentage of
identical and conservatively changed amino acid residues
(with respect to both position and type) in the proteins of
the invention and an aligned protein of equal or different
length. The term "sequence identity" in connection with
sequences of proteins of the invention means the percentage
of identical amino acids with respect to both position and
type in the proteins of the invention and an aligned protein
of equal or different length.
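
The two definitions above can be illustrated with a short calculation on a pair of already aligned sequences. This is a minimal sketch, not the method used by the inventors: the grouping of residues treated as "conservatively changed" is an illustrative assumption, as is the choice to divide by the number of alignment columns in which at least one sequence has a residue. The names `CONSERVATIVE_GROUPS` and `identity_and_similarity` are hypothetical.

```python
from typing import Tuple

# Assumed residue groups for "conservative" changes (illustrative only;
# the source text does not specify a substitution scheme).
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"),
    set("DE"), set("NQ"), set("ST"), set("AG"),
]


def _is_conservative(a: str, b: str) -> bool:
    """True if two different residues fall in the same assumed group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str,
                            gap: str = "-") -> Tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column empty in both sequences, ignore it
        compared += 1
        if a == gap or b == gap:
            continue  # a gap counts as neither identical nor similar
        if a == b:
            identical += 1
            similar += 1
        elif _is_conservative(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no alignment columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    print(identity_and_similarity("GGAI", "GGAV"))
```

Similarity is always at least as high as identity under this scheme, since every identical position is also counted as similar.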

Within the scope of the present invention are subsequences of
one of the proteins of the invention, meaning a consecutive
stretch of amino acid residues taken from SEQ ID NO: 2, SEQ
ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID
NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID
NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24. A subsequence will
typically comprise at least 100 amino acids, preferably at
least 80 amino acids, more preferably at least 70 amino
acids, such as 50 amino acids. It might even be as small as
10-50 amino acids, such as 20-40 amino acids, e.g. about 30
amino acids. A subsequence will typically show a sequence
homology of at least 50%, preferably at least 60%, more
preferably at least 70%, such as at least 80%, e.g. at least
90%, 95% or 98%.

Diagnostic tests according to the invention include
immunoassays selected from the group consisting of a direct
or indirect EIA such as an ELISA, an immunoblot technique
such as a Western blot, a radioimmunoassay, and any other
non-enzyme linked antibody binding assay or procedure such as
a fluorescence, agglutination or precipitation reaction, and
nephelometry.

A preferred embodiment of the present invention relates to
species specific diagnostic tests according to the invention,
said test comprising an ELISA, wherein antibodies against the
proteins of the invention or fragments thereof are detected
in samples.

A preferred embodiment of the invention is an ELISA based on
detection in samples of antibodies against proteins of the
invention. The ELISA may use proteins of the invention, or
variants thereof, i.e. the antigen, as coating agent. An
ELISA will typically be developed according to standard
methods well known in the art, such as methods described in
"Antibodies: A Laboratory Manual", Eds. Harlow and Lane,
Cold Spring Harbor Laboratory (1988), which is hereby
incorporated by reference.

Recombinant proteins will be produced using DNA sequences
obtained essentially using methods described in the examples
below. Such DNA sequences, comprising the entire coding
region of each gene in the gene family of the invention, will
be cloned into an expression vector from which the deduced
protein sequence can be purified. The purified proteins will
be analyzed for reactivity in ELISA using both monoclonal and
polyclonal antibodies as well as sera from experimentally
infected mice and human patient sera.

From the experimentally infected mice sera it is known that
non-linear epitopes are recognized predominantly. Thus it is
contemplated that different forms of purification schemes
known in the art will be used to analyze for the presence of
discontinuous epitopes, and to analyze whether the human
immune response is also directed against such epitopes.

Preferred embodiments of the present invention relate to
species specific diagnostic tests according to the invention
wherein the nucleic acid fragments have sequences selected
from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ
ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID
NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID
NO: 21, and SEQ ID NO: 23.

In connection with nucleic acid fragments according to the
present invention the term "variant" should be understood as
a sequence of nucleic acids which shows a sequence homology
of less than 100%. A variant sequence can be of the same size
or it can be of a different size as the sequence it is
compared to. A variant will typically show a sequence
homology of at least 50%, preferably at least 60%, more
preferably at least 70%, such as at least 80%, e.g. at least
90%, 95% or 98%.

The term "sequence homology" in connection with nucleic acid
fragments of the invention means the percentage of matching
nucleic acids (with respect to both position and type) in the
nucleic acid fragments of the invention and an aligned
nucleic acid fragment of equal or different length.

In order to obtain information concerning the general
distribution of each of the genes according to the present
invention, PCR will be performed for each gene on all
available C. pneumoniae isolates. This will provide
information on the general variability of the genes or
nucleic acid fragments of the invention. Variable regions
will be sequenced. From patient samples PCR will be used to
amplify variable parts of the genes for epidemiology.
Non-variable parts will be used for amplification by PCR and
analyzed for possible use as a diagnostic test. It is
contemplated that if variability is discovered, PCR of
variable regions can be used for epidemiology. PCR of
non-variable regions can be used as a species specific
diagnostic test, using genes encoding proteins known to be
invariable in all known isolates prepared as targets for PCR
to genes encoding proteins with unknown function.
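
The distinction drawn above between variable and non-variable regions can be sketched as a scan over an alignment of the same gene from several isolates. The example below is a minimal illustration under stated assumptions: the input sequences are already aligned to equal length, "-" marks a gap, and a region counts as non-variable only if every isolate carries the same base at every position. The function name `conserved_windows`, the default minimum length and the toy sequences are hypothetical and do not come from this disclosure.

```python
from typing import List, Tuple


def conserved_windows(aligned: List[str],
                      min_length: int = 18) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges that are invariant.

    A column is invariant when all aligned sequences share the same
    non-gap character. Only runs of at least `min_length` columns are kept.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all aligned sequences must have equal length")

    windows: List[Tuple[int, int]] = []
    run_start = None
    # Iterate one column past the end so a trailing run is closed.
    for col in range(width + 1):
        invariant = (
            col < width
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if invariant and run_start is None:
            run_start = col
        elif not invariant and run_start is not None:
            if col - run_start >= min_length:
                windows.append((run_start, col))
            run_start = None
    return windows


if __name__ == "__main__":
    toy_isolates = ["ACGTACGTAA", "ACGTACGTTA", "ACGTACGTAA"]
    print(conserved_windows(toy_isolates, min_length=4))
```

Columns outside the returned windows are the candidates for epidemiological typing, while the windows themselves are candidate targets for a species specific amplification.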

Particularly preferred embodiments of the present invention
relate to diagnostic tests according to the invention,
wherein detection of nucleic acid fragments is obtained by
using nucleic acid amplification, preferably polymerase chain
reaction (PCR).

Within the scope of the present invention is a PCR based test
directed at detecting nucleic acid fragments of the invention
or variants thereof. A PCR test will typically be developed
according to methods well known in the art and will typically
comprise a PCR test capable of detecting and differentiating
between nucleic acid fragments of the invention. Preferred
are quantitative competitive PCR tests or nested PCR tests.
The PCR test according to the invention will typically be
developed according to methods described in detail in EP B
540 588, EP A 586 112, EP A 643 140 or EP A 669 401, which
are hereby incorporated by reference.

Within the scope of the present invention are variants and
subsequences of one of the nucleic acid fragments of the
invention, meaning a consecutive stretch of nucleic acids
taken from SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID
NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO:
15, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23. A variant
or subsequence will preferably comprise at least 100 nucleic
acids, preferably at least 80 nucleic acids, more preferably
at least 70 nucleic acids, such as at least 50 nucleic acids.
It might even be as small as 10-50 nucleic acids, such as
20-40 nucleic acids, e.g. about 30 nucleic acids. A
subsequence will typically show a sequence homology of at
least 50%, preferably at least 60%, more preferably at least
70%, such as at least 80%, e.g. at least 90%, 95% or 98%.
[...] Accordingly, a subsequence of 100 nucleic acids or
[...] show a homology of at least 80%.

A very important aspect of the present invention relates to
proteins of the invention derived from Chlamydia pneumoniae,
having amino acid sequences selected from the group
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ
ID NO: 24 [...] having a sequence similarity of at least 50%,
preferably at least 60%, more preferably at least 70%, such
as at least 80%, e.g. at least 90%, 95% or 98%, and a similar
biological function.

By the term "similar biological function" is meant that the
[...] characteristics similar with the proteins [...] genes
range between 43-55%. Comparison of the amino acid sequences
of Omp4-15 shows 34-49% identity and 53-[...]% similarity.
The homology is generally scattered along the [...] as seen
[...] the homology is more pronounced. This is seen in the
repeated [...] genes. It is interesting that the DNA homology
is not [...] the four amino acids GGAI. This may indicate a
functional role of this part of the protein and indicates
that the repeated structure did not occur by a duplication of
the gene. In addition to the four amino acid repeats GGAI, a
region from amino acid 400 to 490 has a higher degree of
homology than the rest of the protein, with the conserved
sequence FYDPI occurring in all sequences. As further
indication of similarity in function the amino acid
tryptophan (W) is perfectly conserved at 4-6 localizations in
the C-terminal part of the protein.
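
The conserved features named above (the GGAI repeats, the FYDPI sequence and the tryptophan residues towards the C-terminus) can be located in a given amino acid sequence with a simple scan. The sketch below is illustrative only: the toy sequence is invented, the names `find_motif` and `summarize_features` are hypothetical, and treating the second half of the sequence as the "C-terminal part" is an assumption made purely for the example.

```python
import re
from typing import Dict, List


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 1-based start positions of every occurrence of `motif`.

    A lookahead pattern is used so that overlapping matches are reported.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


def summarize_features(sequence: str) -> Dict[str, List[int]]:
    """Locate GGAI, FYDPI and C-terminal tryptophans in a protein sequence."""
    seq = sequence.upper()
    half = len(seq) // 2  # assumed boundary of the "C-terminal part"
    return {
        "GGAI": find_motif(seq, "GGAI"),
        "FYDPI": find_motif(seq, "FYDPI"),
        "W_c_terminal": [i + 1 for i in range(half, len(seq)) if seq[i] == "W"],
    }


if __name__ == "__main__":
    toy_sequence = "MGGAIKKGGAIFYDPIAAWAAW"  # invented, not a real Omp sequence
    print(summarize_features(toy_sequence))
```

Applied to the deduced sequences of the gene family, such a scan would give the number and spacing of GGAI repeats per protein; no such results are asserted here.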

Since none of the genes and deduced amino acid sequences of
the invention are identical, the following is within the
scope of the present invention: production of monospecific
antibodies, the use of said antibodies for characterizing
which C. pneumoniae proteins are expressed, the use of said
antibodies for characterizing at which time during the
developmental life cycle said C. pneumoniae proteins are
expressed, and the use of said antibodies for characterizing
the precise cellular localization of said C. pneumoniae
proteins. Also within the scope of the present invention is
the use of monospecific antibodies against proteins of the
invention for determining which part of said proteins is
surface exposed and how proteins in the C. pneumoniae COMC
interact with each other.

Preferred embodiments of the present invention relate to
polypeptides which comprise subsequences of the proteins of
the invention, said subsequences comprising the sequence
GGAI. Further preferred embodiments of the present invention
relate to polypeptides which comprise subsequences of the
proteins of the invention, said subsequences comprising the
sequence FSGE.

Polypeptides according to the invention will typically be of
a length of at least 6 amino acids, preferably at least 15
amino acids, preferably at least 20 amino acids, preferably
at least 25 amino acids, preferably at least 30 amino acids,
preferably at least 35 amino acids, preferably at least 40
amino acids, preferably at least 45 amino acids, preferably
at least 50 amino acids, preferably at least 55 amino acids,
preferably at least 100 amino acids.

A very important aspect of the present invention relates to
nucleic acid fragments of the invention derived from
Chlamydia pneumoniae, variants and subsequences thereof.

Another important aspect of the present invention relates to 
antibodies against the proteins according to the invention, 
such antibodies including polyclonal monospecific antibodies 
and monoclonal antibodies against proteins with sequences 
selected from the group consisting of SEQ ID NO: 2, SEQ ID 
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 
12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 
20, SEQ ID NO: 22, and SEQ ID NO: 24.

A very important aspect of the present invention relates to
diagnostic kits for the diagnosis of infection of a mammal,
such as a human, with Chlamydia pneumoniae, said kits
comprising one or more proteins with amino acid sequences
selected from the group consisting of SEQ ID NO: 2, SEQ ID
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO:
12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO:
20, SEQ ID NO: 22, and SEQ ID NO: 24.

Another very important aspect of the present invention
relates to diagnostic kits for the diagnosis of infection of
a mammal, such as a human, with Chlamydia pneumoniae, said
kits comprising antibodies against a protein with an amino
acid sequence selected from the group consisting of SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ ID NO: 24.

30 Antibodies included in a diagnostic kit according to the 
invention can be polyclonal or monoclonal or a mixture 
hereof . 
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^ t-h(=- nresent invention 
!l such as a human, with CHlar^ydia pneu^on.ae, said 
. sequences seXectea ..o. the I'^^^l^-^^Z.J 1 - 

ID NO: 19, SEQ ID NO : 21, and SEQ ID NO : 23. 

^ of the present invention relates to a composition 
An aspect of the prese ^ Chlan^ydia 

.0 for immunizxng a mammal such proteins 
^^■i::,^ «^aid composition comprising one 
pneumoniae, saia comp consisting 
.ith amino acid sequences selected from the g P 
[...] important role for the proteins of the invention in 
[...] typically by using recombinant techniques, and will 
[...] antigen in immunization of mammals, such as rabbits. 
Subsequently, the hyperimmune sera obtained by the 
immunization will be analyzed for protection against C. 
pneumoniae infection using a tissue culture assay. In 
addition, it is contemplated that monoclonal antibodies will 
be produced, typically using standard hybridoma techniques, 
and analyzed for protection against infection with C. 
pneumoniae. 

It is envisioned that particularly interesting and 
immunogenic epitopes will be found in connection with the 
proteins of the invention [...] subsequences of said 
proteins. It [...] polypeptides comprising such subsequences 
of the proteins of the invention [...] 
[...] in immunizing a mammal, such as a human, against 
Chlamydia pneumoniae. 

An important aspect of the present invention relates to the 
use of proteins with sequences selected from the group 
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ 
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID 
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ 
ID NO: 24 in diagnosis of infection of a mammal, such as a 
human, with Chlamydia pneumoniae. 

A preferred embodiment of the present invention relates to 
the use of proteins according to the invention in an 
undenatured form, in diagnosis of infection of a mammal, such 
as a human, with Chlamydia pneumoniae. 

A very important aspect of the present invention relates to 
the use of proteins with sequences selected from the group 
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ 
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID 
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ 
ID NO: 24 for immunizing a mammal, such as a human, against 
Chlamydia pneumoniae. 

A preferred embodiment of the present invention relates to 
the use of proteins according to the invention in an 
undenatured form for immunizing a mammal, such as a human, 
against Chlamydia pneumoniae. 

A very important aspect of the present invention relates to 
the use of nucleic acid fragments with sequences 
selected from the group consisting of SEQ ID NO: 1, SEQ ID 
NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 
11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 
19, SEQ ID NO: 21, and SEQ ID NO: 23 for immunizing a mammal, 
such as a human, against Chlamydia pneumoniae. 
At least one type of vaccine against C. pneumoniae [...] by 
using gene-gun vaccination of mice. Typically, different 
genetic constructs containing nucleic acid fragments, 
combinations of nucleic acid fragments according to the [...] 
gene-gun approach. The mice [...] analyzed for production of 
both humoral and cellular immune response and for protection 
against infection with C. pneumoniae after challenge 
herewith. 

In line with this, the invention also relates to the uses of 
the proteins of the invention as a pharmaceutical (a vaccine) 
as well as to the uses thereof for the preparation of a 
vaccine against infections with Chlamydia pneumoniae. 

[...] preparation of vaccines which contain protein sequences 
as active ingredients [...] understood in the art, as [...] 
4,601,903; [...] 4,599,231; [...] and 4,578,770, all 
incorporated herein by reference. [...] such vaccines are 
prepared as injectables either as liquid solutions or 
suspensions; solid forms suitable for solution in, or 
suspension in, liquid prior to injection may also be 
prepared. The preparation may also be emulsified. The active 
immunogenic ingredient is often mixed with excipients which 
are pharmaceutically acceptable and compatible with the 
active ingredient. Suitable excipients are, for example, 
[...] saline, dextrose, glycerol, [...] and combinations 
thereof. In addition, if desired, the vaccine may contain 
minor amounts of auxiliary substances such as wetting or 
emulsifying agents, pH buffering agents, or adjuvants which 
enhance the effectiveness of the vaccines. 

The vaccines are conventionally administered parenterally, by 
injection, for example, either subcutaneously or 
intramuscularly. Additional formulations which are suitable 
for other modes of administration [...] the compositions take 
the form of [...] sustained release formulations or powders 
and contain [...] active ingredient, preferably 25-70% [...] 
optionally a suitable carrier [...] 

[...] therapeutically effective and immunogenic. The quantity 
to be administered depends on the subject to be treated [...] 
are of the order of several hundred micrograms active 
ingredient per vaccination [...] enhanced if the vaccine 
further comprises an [...] in the art. Other possibilities 
[...] immunomodulating substances such as [...] synthetic 
[...] the above-mentioned [...] 

It is also possible to produce a living vaccine by 
introducing, into a non-pathogenic microorganism, at least 
one nucleic acid fragment encoding a protein fragment or 
protein of the invention, and effecting expression of the 
protein fragment or the protein on the surface of the 
microorganism (e.g. in the form of a fusion protein including 
a membrane anchoring part or in the form of a slightly 
modified protein or protein fragment carrying a lipidation 
signal which allows anchoring in the membrane). The skilled 
person will know how to adapt relevant expression systems for 
this purpose. 

Another part of the invention [...] recent research has 
revealed that a DNA fragment cloned in a vector which is 
non-replicative in eukaryotic cells may be introduced into an 
animal [...] by e.g. intramuscular injection or percutaneous 
administration (the so-called "gene gun" approach). The DNA 
is taken up by e.g. muscle cells and the gene of interest is 
expressed by a 

WO 98/58953 PCT/DK98/00266 

promoter which is functioning in eukaryotes, e.g. a viral 
promoter, and the gene product thereafter stimulates the 
immune system. These newly discovered methods are reviewed in 
Ulmer et al., 1993, which hereby is included by reference. 

Thus, a nucleic acid fragment encoding a protein fragment or 
protein of the invention may be used for effecting in vivo 
expression of antigens, i.e. the nucleic acid fragments may 
be used in so-called DNA vaccines. Hence, the invention also 
relates to a vaccine comprising a nucleic acid fragment 
encoding a protein fragment or a protein of the invention, 
the vaccine effecting in vivo expression of antigen by a 
mammal, such as a human, to whom the vaccine has been 
administered, the amount of expressed antigen being effective 
to confer substantially increased resistance to infections 
with Chlamydia pneumoniae in a mammal, such as a human. 

The efficacy of such a "DNA vaccine" can possibly be enhanced 
by administering the gene encoding the expression product 
together with a DNA fragment encoding a protein which has the 
capability of modulating an immune response. For instance, a 
gene encoding lymphokine precursors or lymphokines (e.g. 
IFN-γ, IL-2, or IL-12) could be administered together with 
the gene encoding the immunogenic protein fragment or 
protein, either by administering two separate DNA fragments 
or by administering both DNA fragments included in the same 
vector. It is also a possibility to administer DNA fragments 
comprising a multitude of nucleotide sequences which each 
encode relevant epitopes of the protein fragments and 
proteins disclosed herein so as to effect a continuous 
sensitization of the immune system with a broad spectrum of 
these epitopes. 

The following experimental non-limiting examples are intended 
to illustrate certain features and embodiments of the 
invention. 
LEGENDS TO FIGURES 

[...] purified C. pneumoniae EB (A) and purified OMC (B). 

Figure 2. The figure shows silver stained 15% SDS-PAGE of 
purified EB and OMC. Lane 1, purified C. pneumoniae EB; lane 
2, [...]; lane 3, purified C. trachomatis EB; lane 4, C. 
trachomatis OMC. 

Figure 3. The figure shows immunoblotting of C. pneumoniae EB 
separated by 10% SDS-PAGE, transferred to nitrocellulose and 
reacted with rabbit anti [...] 

[...] SDS-PAGE of recombinant pEX [...] that were detected by 
the rabbit anti C. pneumoniae serum. Arrow indicates the 
localization of the 117 kDa β-galactosidase protein. 

[...] colony blotting separated by 7.5% [...] transferred to 
nitrocellulose and reacted with [...] lane 2-6 pEX clones 
cultivated at 42°C to induce the production of the 
β-galactosidase fusion proteins. 

[...] Omp5. Arrows indicate primers used for sequencing. 

[...] are found [...] Cluster 2 are found Omp6, 7, 8, 9, and 
15. 

Figure 8 A - J. The figure shows alignment of C. pneumoniae 
Omp4-15, using the program pileup in the GCG package. 

Figure 9. The figure shows immunofluorescence of C. 
pneumoniae infected HeLa, 72 hrs. after infection, reacted 
with mouse monospecific anti-serum against pEX3-36 fusion 
protein. pEX3-36 is a part of the Omp5 gene. 

Figure 10. The figure shows immunoblotting of C. pneumoniae 
EB, lane 1-3 heated to 100°C in SDS-sample buffer, lane 4-6 
unheated. Lane 1 reacted with rabbit anti C. pneumoniae OMC; 
lane 2 and 4 pre-serum; lane 3 and 5 polyclonal rabbit anti 
pEX1-1 fusion protein; lane 6 MAb 26.1. 

Figure 11. The figure shows immunoblotting of C. pneumoniae 
EB, lane 1-4 heated to 100°C in SDS-sample buffer, lane 5-6 
unheated. Reacted with serum from C57 black mice 14 days 
after infection with 10^ CFU of C. pneumoniae. Lane 1 and 5 
mouse 1; lane 2 and 6 mouse 2; lane 3 and 5 mouse 3; and lane 
4 and 8 mouse 4. 

Figure 12. The figure shows immunohistochemistry analysis of 
mouse lung tissue with C. pneumoniae inclusions present both 
in the bronchial epithelium and in the lung parenchyma 
(arrows). 
Cloning of the genes encoding the 98/95 kDa C. pneumoniae 
COMC proteins 

Purification of C. pneumoniae EBs and COMC 

C. pneumoniae was cultivated in HeLa cells. Cultivation was 
done according to the specifications of Miyashita and 
Matsumoto (1992), with the modification that centrifugation 
of supernatant and of the later precipitate and turbid bottom 
layer was carried out at 100,000 x g. The microorganism was 
attached to the HeLa cells by 30 minutes of centrifugation at 
1000 x g, after which the cells were incubated in RPMI 1640 
medium (Gibco BRL, Germany, cat. No. 51800-27), containing 5% 
foetal calf serum (FCS, Gibco BRL, Germany, cat. No. 10106 
169) and gentamicin, for two hours at 37°C in a 5% CO2 
atmosphere. The medium was changed to medium that in addition 
contained 1 mg per ml of cycloheximide. After 48 hours of 
incubation a coverslip was removed from the cultures and the 
inclusion was tested with an antibody specific for C. 
pneumoniae (MAb 26.1) (Christiansen et al. 1994) and a 
monoclonal antibody specific for the species C. trachomatis 
(MAb 32.3, Loke Diagnostics, Arhus, Denmark) to ensure that 
no contamination with C. trachomatis had occurred. The HeLa 
cells were tested by Hoechst stain for Mycoplasma 
contamination as well as by culture in BEa and BEg medium 
(Freund et al., 1979). The C. pneumoniae stocks were also 
tested for Mycoplasma contamination by cultivation in BEa and 
BEg medium. No contamination with C. trachomatis, Mycoplasmas 
or bacteria was detected in cultures or cells. 72 hours 
post-infection the monolayer was washed in PBS, the cells 
were loosened in PBS with a rubber policeman, and the 
Chlamydia were liberated from the host cell by sonication. 
The C. pneumoniae EBs and RBs were purified on discontinuous 
density gradients (Miyashita et al. (1992)). The purity of 
the Chlamydia EBs was verified by negative staining and 
electron microscopy (Figure 1); only particles of a size of 
0.3 to 0.5 µm were detected, in agreement with the structure 
of C. pneumoniae EBs. The purified Chlamydia EBs were 
subjected to sarkosyl extraction as described by Caldwell et 
al. (1981) with the modification that a brief sonication was 
used to suspend the COMC. The purified COMC was tested by 
electron microscopy and negative staining (Figure 1), where a 
folded outer membrane complex was seen. 



SDS-PAGE analysis of purified EBs and COMC 

The proteins from purified EBs and C. pneumoniae OMC were 
separated on a 15% SDS-polyacrylamide gel, and the gel was 
silver stained (Figure 2). In lane 1 it is seen that the 
purified EBs contain major proteins of 100/95 kDa and a 
protein of 38 kDa; in the purified COMC (lane 2) these two 
protein groups are also dominant. In addition, proteins with 
a molecular weight of 62/60 kDa, 55 kDa, and 12 kDa have been 
enriched in the COMC preparation. When the purified C. 
pneumoniae EBs are compared to purified C. trachomatis EB 
(lane 3) it is seen that the predominant protein in the C. 
trachomatis EB is the major outer membrane protein (MOMP), 
and it is also the dominant band in the COMC preparation of 
C. trachomatis (lane 4), and Omp2 of 60/62 kDa as well as 
Omp3 at 12 kDa are seen in the preparation. However, no major 
bands with a size of 100/95 kDa are detected as in the C. 
pneumoniae COMC preparation. 
Production of rabbit polyclonal antibodies against C. 
pneumoniae COMC 

To ensure production of rabbit antibodies that would 
recognize all the C. pneumoniae proteins in immuno-blotting 
and colony-blotting, 10 µg of COMC antigen was dissolved in 
20 µl of SDS sample buffer and thereafter divided into 5 
vials. The dissolved antigen was further diluted in one ml of 
PBS and one ml of Freund incomplete adjuvant (Difco 
Laboratories, USA, cat. No. 0639 - 00 G) and injected into 
the quadriceps muscle of a New Zealand white rabbit. The 
rabbit was given intramuscular injections three times [...] 
week, and after a further three weeks the dissolved COMC 
protein, diluted in one ml PBS, was injected intravenously, 
and the procedure was repeated two weeks later. [...] after 
the beginning of the immunization, the serum was obtained 
from the rabbit. Purified C. pneumoniae EBs were separated by 
SDS-PAGE, and the proteins were electrotransferred to a 
nitrocellulose membrane. The membrane was blocked and 
immunostained with the polyclonal COMC antibody (Figure 3). 
The serum recognized proteins with a size of 100/95, 60 and 
38 kDa in the EB preparation. This is in agreement with the 
sizes of the outer membrane proteins. 

Cloning of the COMC proteins 

Due to the cultivation of C. pneumoniae in HeLa cells, 
contaminating host cell DNA could be present in the EB 
preparations. Therefore the [...] preparations were treated 
with DNAse to remove contaminating DNA. The C. pneumoniae DNA 
was then purified by CsCl gradient [...] containing DNA 
fragments with a size of [...] of the β-galactosidase gene. 
Expression of the gene is [...] colonies of recombinant [...] 
nitrocellulose membranes and the temperature was increased to 
42°C for [...] nitrocellulose membranes on filters soaked in 
5% SDS. The colonies expressing outer membrane proteins were 
detected [...] COMC. The positive clones were cultivated in 
suspension and induced at 42°C for two hours. The protein 
profile of the clones was analysed by SDS-PAGE, and increases 
in the size of the induced β-galactosidase were observed 
(Figure 4). In addition, the proteins were electrotransferred 
to nitrocellulose membranes, and the reaction with the 
polyclonal serum against COMC was confirmed (Figure 5). 



Sequencing of positive COMC clones 

To characterize the pEX clones, the inserted C. pneumoniae 
DNA was sequenced. The resulting DNA sequences were searched 
against the prokaryotic sequences in the GenEmbl database. 
The search identified 6 clones as part of the Omp2 gene, 
2 clones as part of the Omp3 gene, and 2 clones as part of 
the MOMP gene, indicating that COMC proteins had been 
successfully cloned. Furthermore, 32 clones were obtained 
containing DNA sequences not found in the GenEmbl database. 
These sequences could, however, be clustered in two contigs 
of 6 and 4 clones, and three clones were identical. In 
addition, 19 clones were found with no overlap to the contigs 
(Figure 7). To obtain more sequence data for the genes, C. 
pneumoniae DNA was totally digested with BamHI restriction 
enzyme, and the fragments were cloned into the vector 
pBluescript. The ligated DNA was electrotransformed into E. 
coli XL1-Blue and selected on plates containing ampicillin. 
The recombinant bacterial colonies were transferred to a 
nitrocellulose membrane, and colony hybridisation was 
performed using the insert of the pEX1-1 clone as a probe. A 
clone containing a single BamHI fragment of 4.5 kb was found, 
and the hybridisation to the probe was confirmed by Southern 
blotting. The insert of the clone was sequenced 
bi-directionally using synthetic primers for approx. each 300 
bp. The sequence of the BamHI fragment made it possible to 
join the two contigs of pEX clones. In total, together with 
the pEX clones, it was possible to assemble 6.5 kb of DNA 
sequence, encoding two new COMC proteins (Figure 6). 

Additional sequences were obtained by PCR performed on 
purified C. pneumoniae DNA with primers both from the known 
Omp genes and from other known genes. The obtained PCR 
products were sequenced. The sequence organisation is shown 
in Fig. 7. An additional 8 Omp genes were detected. The 
alignment of the deduced amino acid sequences is shown in 
Fig. 8 A and B. 
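
As an illustration of how an amino acid sequence is deduced from a sequenced coding region, the following minimal Python sketch translates a DNA fragment with the standard genetic code. It is not the software used in this work; the function name translate and the compact table construction are assumptions made for the example. The test fragment is a run of fifteen codons taken from SEQ ID NO: 1 in the sequence listing.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string (reading frame 1) to one-letter
    amino acid codes. Spaces are ignored; trailing bases that do not fill a
    codon are dropped. Ambiguous bases raise KeyError (illustrative choice)."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Fifteen codons from SEQ ID NO: 1 (Glu Glu Leu Leu Ser Pro Asp Asp ...).
fragment = "GAG GAA CTT TTA TCA CCT GAT GAT AGC TTT AAT GGA AAT ATC GAT"
print(translate(fragment))  # EELLSPDDSFNGNID
```

The one-letter output corresponds to the three-letter residues printed under the same codons in the sequence listing.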

Analysis of DNA sequence 

The DNA sequences encode the Omp4-15 proteins, with sizes of 
89.6-100.3 kDa (and, for Omp13, 56.1 kDa). Omp4 and Omp5 were 
transcribed in opposite directions. Downstream of Omp4 a 
possible termination structure was located. The 3' end of the 
Omp5 gene was not cloned due to the presence of the BamHI 
restriction enzyme site positioned within the gene. The 
translated DNA sequences of Omp4 and Omp5 were compared by 
use of the gap programme in the GCG package (Wisconsin 
package, version 8.1-UNIX, August 1995, sequence analysis 
software package). The two genes had an amino acid identity 
of 41% (similarity 61%), and a possible cleavage site for 
signal peptidase I was present at amino acid 17 in Omp4 and 
amino acid 25 in Omp5. When the amino acid sequences encoded 
by two other pEX clones were compared to the sequences of 
Omp4 and Omp5, they also had amino acid homology to the 
genes. It is seen that the two clones have homology to the 
same area in the Omp4 and Omp5 proteins. Consequently, the 
pEX clones must have originated from two additional genes. 
Therefore these genes were named Omp6 and Omp7. Similar 
analyses were performed with the other genes. In contrast to 
what was seen for Omp4 and 5, none of the other putative Omp 
proteins had a cleavage site for signal peptides. 
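
To make the identity and similarity figures quoted above concrete, the following Python sketch shows one common way such percentages can be computed from a finished pairwise alignment. It is not the GCG gap programme: the similarity groups, the choice to ignore columns containing a gap, and the toy sequences are all assumptions made for illustration, and GCG derives similarity from a scoring matrix instead.

```python
# Hypothetical conservative-substitution groups (illustrative only).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _similar(x: str, y: str) -> bool:
    """True if the residues are identical or share an assumed group."""
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aln_a: str, aln_b: str, gap: str = "-"):
    """Return (percent identity, percent similarity) for two aligned
    sequences of equal length, counting only columns without a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aln_a.upper(), aln_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    identical = sum(x == y for x, y in pairs)
    similar = sum(_similar(x, y) for x, y in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Toy alignment, not taken from Omp4 or Omp5.
ident, sim = identity_and_similarity("MKT-SGGAI", "MRTASGG-L")
print(f"{ident:.1f}% identity, {sim:.1f}% similarity")
```

With a different denominator (for example the full alignment length including gaps) the percentages would be lower, which is why the convention used should always be stated alongside the figures.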

EXAMPLE 2 

Polyclonal monospecific antibodies against pEX fusion 
proteins and full length recombinant Omp4 

To investigate the topology of the Omp4-7 proteins, 
representative pEX clones were selected from each gene. The 
fusion proteins of β-galactosidase/Omp were induced, and the 
proteins were partially purified as inclusion bodies. Balb/c 
mice were immunized three times intramuscularly with the 
antigens at an interval of one week, and after six weeks the 
serum was obtained from the mice. HeLa cells were infected 
with C. pneumoniae. 72 hours after the infection the 
monolayers were fixed with 3.7% formaldehyde. This treatment 
makes the outer membrane of the Chlamydia impermeable for 
antibodies due to the extensive cross-linking of the outer 
membrane proteins by the formaldehyde. The HeLa cells were 
permeabilized with 0.2% Triton X-100, the monolayers were 
washed in PBS, then incubated with 20% (v/v) FCS to 
inactivate free radicals of the formaldehyde. The mice sera 
were diluted 1:100 in PBS with 20% (v/v) FCS and incubated 
with the monolayers for half an hour. The monolayers were 
washed in PBS and secondary FITC conjugated rabbit anti mouse 
serum was added for half an hour, and the monolayers were 
washed and mounted. Several of the antibodies reacted 
strongly with the EBs in the inclusions (Figure 9). In spite 
of the formaldehyde fixation it could not be excluded that 
the surface of the EB was changed by the treatments, so that 
the antibodies could get access to the Omp4-7. Therefore, the 
reaction was confirmed by immuno-electron microscopy with the 
antibody raised against clone pEX3-36. Purified EB of C. 
pneumoniae were adsorbed to carbon coated nickel grids. After 
the adsorption the grids were washed with PBS and blocked in 
0.5% ovalbumin dissolved in PBS. The antibodies were diluted 
1:100 in the same buffer and incubated for 30 minutes. The 
grids were washed in PBS. Rabbit anti mouse Ig conjugated 
with 10 nm colloidal gold, diluted in PBS containing 1% 
gelatin, was added to the grids for half an hour. The grids 
were washed 3 times in PBS with 1% gelatin and 3 times in 
PBS, and were then contrast-stained with 0.7% phosphotungstic 
acid. The grids were analysed in a Jeol 1010 electron 
microscope at 40 kV. It was seen that the gold particles were 
covering the surface of the purified EB. Because the C. 
pneumoniae EBs were not exposed to any detergent or fixation 
under either the purification or the [...] antibodies, these 
results show that [...] surface exposed epitopes. 

Polyclonal monospecific antibodies against Omp4 

[...] fusion protein was expressed by [...] and purified 
[...] was used for [...] rabbit (six times, 8 [...] each 
time). 

Use of rabbit polyclonal antibodies to recombinant Omp4 for 
detection of Chlamydia pneumoniae in paraffin embedded 
sections 
The lungs of C. pneumoniae infected [...] days after 
intranasal infection. The [...] fixed in 4% formaldehyde, 
paraffin [...] deparaffinized prior to [...] with the rabbit 
serum diluted 1:200 in TBS ([...] 20 mM Tris pH 7.5) for 30 
[...] temperature. After wash two times in TBS the sections 
[...] secondary antibody ([...] anti-rabbit antibodies) 
diluted 1:300 in TBS [...] sections were [...] 
streptavidin-biotin complex ([...]) [...] under microscopic 
inspection [...] (Vector Laboratories [...]) [...] 
Hematoxylin and analysed [...] 

[...] monospecific rabbit [...] 

The insert of [...] containing LIC site [...] inserted in the 
pET-32 LIC vector (Novagen, UK, cat. No. 69076-1). Thereby 
the insert sequence of the pEX1-1 clone was expressed in the 
new vector as a fusion protein; the part of the fusion 
protein encoded by the pET-32 LIC vector had 6 histidine 
residues in a row. The expression of the fusion protein was 
induced in this vector, and the fusion protein could be 
purified under denaturing conditions on a Ni2+ column due to 
the high affinity of the histidine residues to divalent 
cations. The purified protein was used for immunization of a 
New Zealand white rabbit. After [...] times intramuscular and 
[...] times intravenous immunization the serum was obtained 
from the rabbit. Purified C. pneumoniae EB was dissolved in 
SDS-sample buffer. Half of the sample was heated to 100°C in 
the sample buffer, whereas the other half of the sample was 
not heated. The samples were separated by SDS-PAGE, and the 
proteins were transferred to nitrocellulose; the serum was 
reacted with the strips. With the samples heated to 100°C the 
serum recognized a high molecular weight band of 
approximately 95 kDa. This is in agreement with the predicted 
size of Omp5, of which the pEX1-1 clone is a part. However, 
when the antibody was reacted with the strip with unheated 
EB, the pattern was different. Now a band was seen with a 
size of [...]. In addition, weaker bands were observed above 
the band (Figure 10). These data demonstrate that Omp5 needs 
boiling in SDS-sample buffer to be fully denatured and 
migrate with a size as predicted from the gene product. When 
the samples were not boiled, the protein was not fully 
denatured and less SDS binds to the protein, and it has a 
more globular structure that will migrate faster in the 
acrylamide gel. The band pattern looked identical to what was 
obtained with a monoclonal antibody (MAb 26.1, lane 6) we 
earlier have described (Christiansen et al., 1994), reacting 
with the surface of C. pneumoniae EB, but the antibody does 
not react with the fully SDS denatured C. pneumoniae EB in 
immunoblotting. 
Experimental infection of C57 black mice 

Due to the realization of the altered migration of the Omp4-7 
proteins without boiling, we chose to analyse antibodies 
against C. pneumoniae EBs after an experimental infection of 
mice. To obtain antibodies from an infection caused by C. 
pneumoniae, C57 black mice were inoculated intranasally with 
10^ CFU of C. pneumoniae under a light ether anaesthesia. 
After 14 days of infection the serum samples were obtained 
and the lungs were analysed for pathological changes. In two 
of the mice a severe pneumonia was observed in the lung 
sections, and in the third mouse only minor changes were 
observed. The serum from the mice was diluted 1:100 and 
reacted with purified EBs dissolved in sample buffer with and 
without boiling. In the preparations that had been heated to 
100°C the sera from two of the mice reacted strongly with 
bands of 60/62 kDa and weaker bands of 55 kDa, but no 
reaction was observed with proteins of the size of Omp4-7 
(Figure 11). However, when the sera were reacted with the 
preparation that had not been heated they all had a strong 
reaction with a broad band of an approximate size of 75 kDa. 
This is in agreement with the size of the Omp4-7 proteins in 
the unheated preparation. Therefore, it could be concluded 
that the epitopes of the Omp4-7 proteins recognized by the 
antibodies after a C. pneumoniae infection were discontinuous 
epitopes, because the full denaturation of the antigen 
completely destroyed the epitopes. The 75 kDa protein 
observed in unheated samples is not Omp2 (shown by 
immunoblotting with an Omp2 specific antibody). 

EXAMPLE 3 

Comparison of Omp4-7 of C. pneumoniae with putative outer 
membrane proteins (POMP) of C. psittaci 

Longbottom et al. (1996) have published partial sequence from 
98 to 90 kDa proteins from C. psittaci. They have entered the 
full sequence of 5 genes in this family in the EMBL database. 
They have marked the genes "putative outer membrane proteins" 
(POMP) since their precise location was not determined. The 
family is composed of two genes that are completely 
identical, and two genes with high homology to these genes. 
They calculated a molecular size of 90 and 91 kDa. The fifth 
encodes a protein of 98 kDa. The sequences of the Omp4-7 
proteins of C. pneumoniae were compared to the sequences of 
the C. psittaci POMP proteins with the programme pileup in 
the GCG package. The amino acid homologies were in the range 
of 51-63%. It is seen that the C. pneumoniae Omp4-5 proteins 
are most related to the 98 kDa POMP protein of C. psittaci. 
Interestingly, the 98 kDa C. psittaci POMP protein is more 
related to the C. pneumoniae genes than to the other C. 
psittaci genes. The repeated sequences of GGAI were conserved 
in the 98 kDa POMP protein, but only three GGAI repeats were 
present in the 90 and 91 kDa C. psittaci POMP proteins. For 
C. psittaci it has been shown that antibodies to these 
proteins seem to be protective for the infection. 
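
The GGAI repeats mentioned above can be located in any deduced protein sequence with a simple motif search. The Python sketch below is only an illustration; the function name motif_positions is an assumption, and the two short test stretches are transcribed in one-letter code from the translation of SEQ ID NO: 1 given in the sequence listing.

```python
def motif_positions(protein: str, motif: str = "GGAI") -> list:
    """Return the 1-based start positions of every occurrence of motif
    in a protein sequence given in one-letter code."""
    protein = protein.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = protein.find(motif, start + 1)
    return positions


# Two stretches from the SEQ ID NO: 1 translation (positions are relative
# to each stretch, not to the full-length protein).
stretches = ["VVAGNFSTADGGAIKG", "STSGGAIDDEGTSILS"]
for s in stretches:
    print(s, motif_positions(s))
```

Applied to the full-length deduced sequences, the length of the returned list gives the number of GGAI repeats per protein, which is the quantity compared between the C. pneumoniae and C. psittaci proteins.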
SEQUENCE LISTING 

(1) GENERAL INFORMATION 
(i) APPLICANT 
(A) NAME: Svend Birkelund 
(B) STREET: Dept. of Medical [...] Immunology, 
University of Arhus 
(C) CITY: Arhus C 
(D) STATE OR PROVINCE: 
(E) COUNTRY: Denmark 
(F) POSTAL CODE: 8000 

(ii) TITLE OF THE INVENTION: Chlamydia pneumoniae antigens 

(iii) NUMBER OF SEQUENCES: 30 

(iv) COMPUTER-READABLE FORM: 
(A) MEDIUM TYPE: Diskette 
(B) COMPUTER: IBM Compatible 
(C) OPERATING SYSTEM: DOS 
(D) SOFTWARE: FastSEQ for Windows [...] 

(v) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3200 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ix) FEATURE: 
(A) NAME/KEY: Coding Sequence 
(B) LOCATION: 205...2987 
(D) OTHER INFORMATION: 

SEQUENCE DESCRIPTION: SEQ ID NO:1: 

CAATGTCGAA GAGAGCACTA ACCAGGAAAA ^TGCJATTTC ^^A^J^^ UttStGAC 
^^StACT TGCGTCATAT AAAATAGAAA TCTATATTGA TGCGAATAGT 

^Sgttttg tcatctttaa cttgatttac TTATTTTG 

fcTCTAAAAA ACAAAAGCAT TACC ATG AAG AC^ TCG 



1 ^ 



Val Ser Ser vai ^^^^ Tq 



60 
120 
180 
231 



279 



15 
10 
GAG GAA CTT TTA TCA CCT GAT GAT AGC TTT AAT GGA AAT ATC GAT Teft 327 
Glu Glu Leu Leu Ser Pro Asp Asp Ser Phe Asn Gly Asn Ile Asp Ser 
30 35 40 

GGA ACG TTT ACT CCA AAA ACT TCA GCC ACA ACA TAT TCT CTA ACA GGA 375 
Gly Thr Phe Thr Pro Lys Thr Ser Ala Thr Thr Tyr Ser Leu Thr Gly 
45 50 55 

GAT GTC TTC TTT TAC GAG CCT GGA AAA GGC ACT CCC TTA TCT GAC AGT 423 
Asp Val Phe Phe Tyr Glu Pro Gly Lys Gly Thr Pro Leu Ser Asp Ser 
60 65 70 

TGT TTT AAG CAA ACC ACG GAC AAT CTT ACC TTC TTG GGG AAC GGT CAT 471 
Cys Phe Lys Gln Thr Thr Asp Asn Leu Thr Phe Leu Gly Asn Gly His 
75 80 85 

AGC TTA ACG TTT GGC TTT ATA GAT GCT GGC ACT CAT GCA GGT GCT GCT 519 
Ser Leu Thr Phe Gly Phe Ile Asp Ala Gly Thr His Ala Gly Ala Ala 
90 95 100 105 

GCA TCT ACA ACA GCA AAT AAG AAT CTT ACC TTC TCA GGG TTT TCC TTA 567 
Ala Ser Thr Thr Ala Asn Lys Asn Leu Thr Phe Ser Gly Phe Ser Leu 
110 115 120 

CTG AGT TTT GAT TCC TCT CCT AGC ACA ACG GTT ACT ACA GGT CAG GGA 615 
Leu Ser Phe Asp Ser Ser Pro Ser Thr Thr Val Thr Thr Gly Gln Gly 
125 130 135 

ACG CTT TCC TCA GCA GGA GGC GTA AAT TTA GAA AAT ATT CGT AAA CTT 663 
Thr Leu Ser Ser Ala Gly Gly Val Asn Leu Glu Asn Ile Arg Lys Leu 
140 145 150 

GTA GTT GCT GGG AAT TTT TCT ACT GCA GAT GGT GGA GCT ATC AAA GGA 711 
Val Val Ala Gly Asn Phe Ser Thr Ala Asp Gly Gly Ala Ile Lys Gly 
155 160 165 

GCG TCT TTC CTT TTA ACT GGC ACT TCT GGA GAT GCT CTT TTT AGT AAC 759 
Ala Ser Phe Leu Leu Thr Gly Thr Ser Gly Asp Ala Leu Phe Ser Asn 
170 175 180 185 

AAC TCT TCA TCA ACA AAG GGA GGA GCA ATT GCT ACT ACA GCA GGC GCT 807 
Asn Ser Ser Ser Thr Lys Gly Gly Ala Ile Ala Thr Thr Ala Gly Ala 
190 195 200 

CGC ATA GCA AAT AAC ACA GGT TAT GTT AGA TTC CTA TCT AAC ATA GCG 855 
Arg Ile Ala Asn Asn Thr Gly Tyr Val Arg Phe Leu Ser Asn Ile Ala 
205 210 215 

TCT ACG TCA GGA GGC GCT ATC GAT GAT GAA GGC ACG TCG ATA CTA TCG 903 
Ser Thr Ser Gly Gly Ala Ile Asp Asp Glu Gly Thr Ser Ile Leu Ser 
220 225 230 

AAC AAC AAA TTT CTA TAT TTT GAA GGG AAT GCA GCG AAA ACT ACT GGC 951 
Asn Asn Lys Phe Leu Tyr Phe Glu Gly Asn Ala Ala Lys Thr Thr Gly 
235 240 245 

GGT GCG ATC TGC AAC ACC AAG GCG AGT GGA TCT CCT GAA CTG ATA ATC 999 
Gly Ala Ile Cys Asn Thr Lys Ala Ser Gly Ser Pro Glu Leu Ile Ile 
250 255 260 265 

TCT AAC AAT AAG ACT CTG ATC TTT GCT TCA AAC GTA GCA GAA ACA AGC 1047
Ser Asn Asn Lys Thr Leu Ile Phe Ala Ser Asn Val Ala Glu Thr Ser
270 275 280

GGT GGC GCC ATC CAT GCT AAA AAG CTA GCC CTT TCC TCT GGA GGC TTT 1095
Gly Gly Ala Ile His Ala Lys Lys Leu Ala Leu Ser Ser Gly Gly Phe
285 290 295

ACA GAG TTT CTA CGA AAT AAT GTC TCA TCA GCA ACT CCT AAG GGG GGT 1143
Thr Glu Phe Leu Arg Asn Asn Val Ser Ser Ala Thr Pro Lys Gly Gly
300 305 310

GCT ATC AGC ATC GAT GCC TCA GGA GAG CTC AGT CTT TCT GCA GAG ACA 1191
Ala Ile Ser Ile Asp Ala Ser Gly Glu Leu Ser Leu Ser Ala Glu Thr
315 320 325

GGA AAC ATT ACC TTT GTA AGA AAT ACC CTT ACA ACA ACC GGA AGT ACC 1239
Gly Asn Ile Thr Phe Val Arg Asn Thr Leu Thr Thr Thr Gly Ser Thr
330 335 340 345

GAT ACT CCT AAA CGT AAT GCG ATC AAC ATA GGA AGT AAC GGG AAA TTC 1287
Asp Thr Pro Lys Arg Asn Ala Ile Asn Ile Gly Ser Asn Gly Lys Phe
350 355 360

ACG GAA TTA CGG GCT GCT AAA AAT CAT ACA ATT TTC TTC TAT GAT CCC 1335
Thr Glu Leu Arg Ala Ala Lys Asn His Thr Ile Phe Phe Tyr Asp Pro
365 370 375

ATC ACT TCA GAA GGA ACC TCA TCA GAC GTA TTG AAG ATA AAT AAC GGC 1383
Ile Thr Ser Glu Gly Thr Ser Ser Asp Val Leu Lys Ile Asn Asn Gly
380 385 390

TCT GCG GGA GCT CTC AAT CCA TAT CAA GGA ACG ATT CTA TTT TCT GGA 1431
Ser Ala Gly Ala Leu Asn Pro Tyr Gln Gly Thr Ile Leu Phe Ser Gly
395 400 405

GAA ACC CTA ACA GCA GAT GAA CTT AAA GTT GCT GAC AAT TTA AAA TCT 1479
Glu Thr Leu Thr Ala Asp Glu Leu Lys Val Ala Asp Asn Leu Lys Ser
410 415 420 425

TCA TTC ACG CAG CCA GTC TCC CTA TCC GGA GGA AAG TTA TTG CTA CAA 1527
Ser Phe Thr Gln Pro Val Ser Leu Ser Gly Gly Lys Leu Leu Leu Gln
430 435 440

AAG GGA GTC ACT TTA GAG AGC ACG AGC TTC TCT CAA GAG GCC GGT TCT 1575
Lys Gly Val Thr Leu Glu Ser Thr Ser Phe Ser Gln Glu Ala Gly Ser
445 450 455

CTC CTC GGC ATG GAT TCA GGA ACG ACA TTA TCA ACT ACA GCT GGG AGT 1623
Leu Leu Gly Met Asp Ser Gly Thr Thr Leu Ser Thr Thr Ala Gly Ser
460 465 470



ATT ACA ATC ACG AAC CTA GGA ATC AAT GTT GAC TCC TTA GGT CTT AAG 1671
Ile Thr Ile Thr Asn Leu Gly Ile Asn Val Asp Ser Leu Gly Leu Lys
475 480 485

CAG CCC GTC AGC CTA ACA GCA AAA GGT GCT TCA AAT AAA GTG ATC GTA 1719
Gln Pro Val Ser Leu Thr Ala Lys Gly Ala Ser Asn Lys Val Ile Val
490 495 500 505

TCT GGG AAG CTC AAC CTG ATT GAT ATT GAA GGG AAC ATT TAT GAA act" 1767
Ser Gly Lys Leu Asn Leu Ile Asp Ile Glu Gly Asn Ile Tyr Glu Ser
510 515 520

CAT ATG TTC AGC CAT GAC CAG CTC TTC TCT CTA TTA AAA ATC ACG GTT 1815
His Met Phe Ser His Asp Gln Leu Phe Ser Leu Leu Lys Ile Thr Val
525 530 535

GAT GCT GAT GTT GAT ACT AAC GTT GAC ATC AGC AGC CTT ATC CCT GTT 1863
Asp Ala Asp Val Asp Thr Asn Val Asp Ile Ser Ser Leu Ile Pro Val
540 545 550

CCT GCT GAG GAT CCT AAT TCA GAA TAC GGA TTC CAA GGA CAA TGG AAT 1911
Pro Ala Glu Asp Pro Asn Ser Glu Tyr Gly Phe Gln Gly Gln Trp Asn
555 560 565

GTT AAT TGG ACT ACG GAT ACA GCT ACA AAT ACA AAA GAG GCC ACG GCA 1959
Val Asn Trp Thr Thr Asp Thr Ala Thr Asn Thr Lys Glu Ala Thr Ala
570 575 580 585

ACT TGG ACC AAA ACA GGA TTT GTT CCC AGC CCC GAA AGA AAA TCT GCG 2007
Thr Trp Thr Lys Thr Gly Phe Val Pro Ser Pro Glu Arg Lys Ser Ala
590 595 600

TTA GTA TGC AAT ACC CTA TGG GGA GTC TTT ACT GAC ATT CGC TCT CTG 2055
Leu Val Cys Asn Thr Leu Trp Gly Val Phe Thr Asp Ile Arg Ser Leu
605 610 615

CAA CAG CTT GTA GAG ATC GGC GCA ACT GGT ATG GAA CAC AAA CAA GGT 2103
Gln Gln Leu Val Glu Ile Gly Ala Thr Gly Met Glu His Lys Gln Gly
620 625 630

TTC TGG GTT TCC TCC ATG ACG AAC TTC CTG CAT AAG ACT GGA GAT GAA 2151
Phe Trp Val Ser Ser Met Thr Asn Phe Leu His Lys Thr Gly Asp Glu
635 640 645

AAT CGC AAA GGC TTC CGT CAT ACC TCT GGA GGC TAC GTC ATC GGT GGA 2199
Asn Arg Lys Gly Phe Arg His Thr Ser Gly Gly Tyr Val Ile Gly Gly
650 655 660 665

AGT GCT CAC ACT CCT AAA GAC GAC CTA TTT ACC TTT GCG TTC TGC CAT 2247
Ser Ala His Thr Pro Lys Asp Asp Leu Phe Thr Phe Ala Phe Cys His
670 675 680

CTC TTT GCT AGA GAC AAA GAT TGT TTT ATC GCT CAC AAC AAC TCT AGA 2295
Leu Phe Ala Arg Asp Lys Asp Cys Phe Ile Ala His Asn Asn Ser Arg
685 690 695

ACC TAC GGT GGA ACT TTA TTC TTC AAG CAC TCT CAT ACC CTA CAA CCC 2343
Thr Tyr Gly Gly Thr Leu Phe Phe Lys His Ser His Thr Leu Gln Pro
700 705 710
... .nr TZ.T TTG AGA TTA GGA AGA GCA AAG TTT TCT GAA TCA GCT ATA 2391
Gln Asn Tyr Leu Arg Leu Gly Arg Ala Lys Phe Ser Glu Ser Ala Ile
715 720 725

... r^T^r. TTC CCT AGG GAA ATT CCC CTA GCC TTG GAT GTC CAA GTT TCG 2439
Glu Lys Phe Pro Arg Glu Ile Pro Leu Ala Leu Asp Val Gln Val Ser
730 735 740 745

TTC AGC CAT TCA GAC AAC CGT ATG GAA ACG CAC TAT ACC TCA TTG CCA 2487
Phe Ser His Ser Asp Asn Arg Met Glu Thr His Tyr Thr Ser Leu Pro
750 755 760

rn^ TCC GAA GGT TCT TGG AGC AAC GAG TGT ATA GCT GGT GGT ATC GGC 2535
Glu Ser Glu Gly Ser Trp Ser Asn Glu Cys Ile Ala Gly Gly Ile Gly
765 770 775

CTA GAC CTT CCT TTT GTT CTT TCC AAC CCA CAT CCT CTT TTC AAG ACC 2583
Leu Asp Leu Pro Phe Val Leu Ser Asn Pro His Pro Leu Phe Lys Thr
780 785 790

TTC ATT CCA CAG ATG AAA GTC GAA ATG GTT TAT GTA TCA CAA AAT AGC 2631
Phe Ile Pro Gln Met Lys Val Glu Met Val Tyr Val Ser Gln Asn Ser
795 800 805

TTC TTC GAA AGC TCT AGT GAT GGC CGT GGT TTT AGT ATT GGA AGG CTG 2679
Phe Phe Glu Ser Ser Ser Asp Gly Arg Gly Phe Ser Ile Gly Arg Leu
810 815 820 825

.TT Azxr TTC TCG ATT CCT GTG GGT GCG AAA TTC GTG CAG GGG GAT ATC 2727
Leu Asn Leu Ser Ile Pro Val Gly Ala Lys Phe Val Gln Gly Asp Ile
830 835 840

GGA GAT TCC TAC ACC TAT GAT CTC TCA GGA TTC TTT GTT TCC GAT GTC 2775
Gly Asp Ser Tyr Thr Tyr Asp Leu Ser Gly Phe Phe Val Ser Asp Val
845 850 855

TAT CGT AAC AAT CCC CAA TCT ACA GCG ACT CTT GTG ATG AGC CCA GAC 2823
Tyr Arg Asn Asn Pro Gln Ser Thr Ala Thr Leu Val Met Ser Pro Asp
860 865 870

TCT TGG AAA ATT CGC GGT GGC AAT CTT TCA AGA CAG GCA TTT TTA CTG 2871
Ser Trp Lys Ile Arg Gly Gly Asn Leu Ser Arg Gln Ala Phe Leu Leu
875 880 885

.GG GGT AGC AAC AAC TAC GTC TAC AAC TCC AAT TGT GAG CTC TTC GGA 2919
Arg Gly Ser Asn Asn Tyr Val Tyr Asn Ser Asn Cys Glu Leu Phe Gly
890 895 900 905

CAT TAC GCT ATG GAA CTC CGT GGA TCT TCA AGG AAC TAC AAT GTA GAT 2967
His Tyr Ala Met Glu Leu Arg Gly Ser Ser Arg Asn Tyr Asn Val Asp
910 915 920

GTT GGT ACC AAA CTC CGA TTC TAGATTGCT AAAACTCCCT AGTTCTTCTA GGGAG 3022
Val Gly Thr Lys Leu Arg Phe
925



TTTTCTCATA CTTTTAGGGA AATATTTGCT ATAGGGAATG CTTTCCTTGC AAACTGTAAA 3082 
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AAATAACATT TGTCCCTCTT CAAAAAAGAT TTCTT-rrz..^ 

TTXXAAAAAC AC^AAA.AA ..AAXAOACA 11^.^ -J-TTTxa 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 928 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Lys Thr Ser He Pro T,-r. w i r 

J- ±±e fro Trp Val Leu Val Ser Sot- u=i r 
1 5 ^'^^ ^^"^ Val Leu Ala Phe 

„.3 0>„ se. .e„ . ^ 

S„ o.. 3„ 30^ 

Ser Ala Thr Thr Tyr Ser Leu Thr Gly Asp Val Phe Phe Tyr Glu Pro
50 55 60

Gly Lys Gly Thr Pro Leu Ser Asp Ser Cys Phe Lys Gln Thr Thr Asp
65 70 75 80

Asn Leu Thr Phe Leu Gly Asn Gly His Ser Leu Thr Phe Gly Phe Ile
85 90 95

Asp Ala Gly Thr His Ala Gly Ala Ala Ala Ser Thr Thr Ala Asn Lys
100 105 110

Asn Leu Thr Phe Ser Gly Phe Ser Leu Leu Ser Phe Asp Ser Ser Pro
115 120 125

Ser Thr Thr Val Thr Thr Gly Gln Gly Thr Leu Ser Ser Ala Gly Gly
130 135 140

Val Asn Leu Glu Asn Ile Arg Lys Leu Val Val Ala Gly Asn Phe Ser
145 150 155 160

Thr Ala Asp Gly Gly Ala Ile Lys Gly Ala Ser Phe Leu Leu Thr Gly
165 170 175

Thr Ser Gly Asp Ala Leu Phe Ser Asn Asn Ser Ser Ser Thr Lys Gly
180 185 190

Gly Ala Ile Ala Thr Thr Ala Gly Ala Arg Ile Ala Asn Asn Thr Gly
195 200 205

Tyr Val Arg Phe Leu Ser Asn Ile Ala Ser Thr Ser Gly Gly Ala Ile
210 215 220

Asp Asp Glu Gly Thr Ser Ile Leu Ser Asn Asn Lys Phe Leu Tyr Phe
225 230 235 240

Glu Gly Asn Ala Ala Lys Thr Thr Gly Gly Ala Ile Cys Asn Thr Lys
245 250 255

Ala Ser Gly Ser Pro Glu Leu Ile Ile Ser Asn Asn Lys Thr Leu Ile
260 265 270

Phe Ala Ser Asn Val Ala Glu Thr Ser Gly Gly Ala Ile His Ala Lys
275 280 285

Lys Leu Ala Leu Ser Ser Gly Gly Phe Thr Glu Phe Leu Arg Asn Asn
290 295 300

Val Ser Ser Ala Thr Pro Lys Gly Gly Ala Ile Ser Ile Asp Ala Ser



3142 
3200
305 310 315 320

Gly Glu Leu Ser Leu Ser Ala Glu Thr Gly Asn Ile Thr Phe Val Arg
325 330 335

Asn Thr Leu Thr Thr Thr Gly Ser Thr Asp Thr Pro Lys Arg Asn Ala
340 345 350

Ile Asn Ile Gly Ser Asn Gly Lys Phe Thr Glu Leu Arg Ala Ala Lys
355 360 365

Asn His Thr Ile Phe Phe Tyr Asp Pro Ile Thr Ser Glu Gly Thr Ser
370 375 380

Ser Asp Val Leu Lys Ile Asn Asn Gly Ser Ala Gly Ala Leu Asn Pro
385 390 395 400

Tyr Gln Gly Thr Ile Leu Phe Ser Gly Glu Thr Leu Thr Ala Asp Glu
405 410 415

Leu Lys Val Ala Asp Asn Leu Lys Ser Ser Phe Thr Gln Pro Val Ser
420 425 430

Leu Ser Gly Gly Lys Leu Leu Leu Gln Lys Gly Val Thr Leu Glu Ser
435 440 445

Thr Ser Phe Ser Gln Glu Ala Gly Ser Leu Leu Gly Met Asp Ser Gly
450 455 460

Thr Thr Leu Ser Thr Thr Ala Gly Ser Ile Thr Ile Thr Asn Leu Gly
465 470 475 480

Ile Asn Val Asp Ser Leu Gly Leu Lys Gln Pro Val Ser Leu Thr Ala
485 490 495

Lys Gly Ala Ser Asn Lys Val Ile Val Ser Gly Lys Leu Asn Leu Ile
500 505 510

Asp Ile Glu Gly Asn Ile Tyr Glu Ser His Met Phe Ser His Asp Gln
515 520 525

Leu Phe Ser Leu Leu Lys Ile Thr Val Asp Ala Asp Val Asp Thr Asn
530 535 540

Val Asp Ile Ser Ser Leu Ile Pro Val Pro Ala Glu Asp Pro Asn Ser
545 550 555 560

Glu Tyr Gly Phe Gln Gly Gln Trp Asn Val Asn Trp Thr Thr Asp Thr
565 570 575

Ala Thr Asn Thr Lys Glu Ala Thr Ala Thr Trp Thr Lys Thr Gly Phe
580 585 590

Val Pro Ser Pro Glu Arg Lys Ser Ala Leu Val Cys Asn Thr Leu Trp
595 600 605

Gly Val Phe Thr Asp Ile Arg Ser Leu Gln Gln Leu Val Glu Ile Gly
610 615 620

Ala Thr Gly Met Glu His Lys Gln Gly Phe Trp Val Ser Ser Met Thr
625 630 635 640

Asn Phe Leu His Lys Thr Gly Asp Glu Asn Arg Lys Gly Phe Arg His
645 650 655

Thr Ser Gly Gly Tyr Val Ile Gly Gly Ser Ala His Thr Pro Lys Asp
660 665 670

Asp Leu Phe Thr Phe Ala Phe Cys His Leu Phe Ala Arg Asp Lys Asp
675 680 685

Cys Phe Ile Ala His Asn Asn Ser Arg Thr Tyr Gly Gly Thr Leu Phe
690 695 700

Phe Lys His Ser His Thr Leu Gln Pro Gln Asn Tyr Leu Arg Leu Gly
705 710 715 720

Arg Ala Lys Phe Ser Glu Ser Ala Ile Glu Lys Phe Pro Arg Glu Ile
725 730 735

Pro Leu Ala Leu Asp Val Gln Val Ser Phe Ser His Ser Asp Asn Arg
740 745 750

Met Glu Thr His Tyr Thr Ser Leu Pro Glu Ser Glu Gly Ser Trp Ser
755 760 765
Asn Glu Cys Ile Ala Gly Gly Ile Gly Leu Asp Leu Pro Phe Val Leu
770 775 780

Ser Asn Pro His Pro Leu Phe Lys Thr Phe Ile Pro Gln Met Lys Val
785 790 795 800

Glu Met Val Tyr Val Ser Gln Asn Ser Phe Phe Glu Ser Ser Ser Asp
805 810 815

Gly Arg Gly Phe Ser Ile Gly Arg Leu Leu Asn Leu Ser Ile Pro Val
820 825 830

Gly Ala Lys Phe Val Gln Gly Asp Ile Gly Asp Ser Tyr Thr Tyr Asp
835 840 845

Leu Ser Gly Phe Phe Val Ser Asp Val Tyr Arg Asn Asn Pro Gln Ser
850 855 860

Thr Ala Thr Leu Val Met Ser Pro Asp Ser Trp Lys Ile Arg Gly Gly
865 870 875 880

Asn Leu Ser Arg Gln Ala Phe Leu Leu Arg Gly Ser Asn Asn Tyr Val
885 890 895

Tyr Asn Ser Asn Cys Glu Leu Phe Gly His Tyr Ala Met Glu Leu Arg
900 905 910

Gly Ser Ser Arg Asn Tyr Asn Val Asp Val Gly Thr Lys Leu Arg Phe 
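
The residues of SEQ ID NO: 2 can be cross-checked against SEQ ID NO: 1, where each row of sixteen codons is printed above its three-letter translation. The Python sketch below is one possible way to run that check, assuming the standard genetic code applies to the coding region. The function names `translate_row` and `mismatches`, and the labels `Ter` (stop) and `Xaa` (unreadable codon), are illustrative choices and are not part of the listing.

```python
# Standard genetic code written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # label for stop codons (an assumption of this sketch)
}

# Map every codon to its three-letter residue name.
CODON_TABLE = {
    a + b + c: ONE_TO_THREE[AMINO[16 * i + 4 * j + k]]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_row(codon_row):
    """Translate a space-separated row of codons; unreadable codons give 'Xaa'."""
    return [CODON_TABLE.get(codon.upper(), "Xaa") for codon in codon_row.split()]


def mismatches(codon_row, residue_row):
    """List (index, codon, translated, printed) where the two rows disagree."""
    codons = codon_row.split()
    printed = residue_row.split()
    translated = translate_row(codon_row)
    return [
        (i, codon, t, p)
        for i, (codon, t, p) in enumerate(zip(codons, translated, printed))
        if t != p
    ]


# Row of SEQ ID NO: 1 ending at position 423, with its printed translation.
row_423 = "GAT GTC TTC TTT TAC GAG CCT GGA AAA GGC ACT CCC TTA TCT GAC AGT"
aa_423 = "Asp Val Phe Phe Tyr Glu Pro Gly Lys Gly Thr Pro Leu Ser Asp Ser"
print(mismatches(row_423, aa_423))  # prints []
```

A row in which a codon has been misread is reported as a tuple giving the codon index, the codon as printed, its translation, and the residue printed beneath it.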



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATGAAATCGC AATTTTCCTG GTTAGTGCTC TCTTCGACAT TGGCATGTTT TACTAGTTGT 60

TCCACTGTTT TTGCTGCAAC TGCTGAAAAT ATAGGCCCCT CTGATAGCTT TGACGGAAGT 120 

ACTAACACAG GCACCTATAC TCCTAAAAAT ACGACTACTG GAATAGACTA TACTCTGACA 180 

GGAGATATAA CTCTGCAAAA CCTTGGGGAT TCGGCAGCTT TAACGAAGGG TTGTTTTTCT 240 

GACACTACGG AATCTTTAAG CTTTGCCGGT AAGGGGTACT CACTTTCTTT TTTAAATATT 300 

AAGTCTAGTG CTGAAGGCGC AGCACTTTCT GTTACAACTG ATAAAAATCT GTCGCTAACA 360 

GGATTTTCGA GTCTTACTTT CTTAGCGGCC CCATCATCGG TAATCACAAC CCCCTCAGGA 420 

AAAGGTGCAG TTAAATGTGG AGGGGATCTT ACATTTGATA ACAATGGAAC TATTTTATTT 480

AAACAAGATT ACTGTGAGGA AAATGGCGGA GCCATTTCTA CCAAGAATCT TTCTTTGAAA 540 

AACAGCACGG GATCGATTTC TTTTGAAGGG AATAAATCGA GCGCAACAGG GAAAAAAGGT 600 

GGGGCTATTT GTGCTACTGG TACTGTAGAT ATTACAAATA ATACGGCTCC TACCCTCTTC 660 

TCGAACAATA TTGCTGAAGC TGCAGGTGGA GCTATAAATA GCACAGGAAA CTGTACAATT 720 

ACAGGGAATA CGTCTCTTGT ATTTTCTGAA AATAGTGTGA CAGCGACCGC AGGAAATGGA 780 

GGAGCTCTTT CTGGAGATGC CGATGTTACC ATATCTGGGA ATCAGAGTGT AACTTTCTCA 840 

GGAAACCAAG CTGTAGCTAA TGGCGGAGCC ATTTATGCTA AGAAGCTTAC ACTGGCTTCC 900 

GGGGGGGGGG GGGGTATCTC CTTTTCTAAC AATATAGTCC AAGGTACCAC TGCAGGTAAT 960 

GGTGGAGCCA TTTCTATACT GGCAGCTGGA GAGTGTAGTC TTTCAGCAGA AGCAGGGGAC 1020 

ATTACCTTCA ATGGGAATGC CATTGTTGCA ACTACACCAC AAACTACAAA AAGAAATTCT 1080 

ATTGACATAG GATCTACTGC AAAGATCACG AATTTACGTG CAATATCTGG GCATAGCATC 1140 

TTTTTCTACG ATCCGATTAC TGCTAATACG GCTGCGGATT CTACAGATAC TTTAAATCTC 1200 

AATAAGGCTG ATGCAGGTAA TAGTACAGAT TATAGTGGGT CGATTGTTTT TTCTGGTGAA 1260 
AAGCTCTCTG AAGATGAAGC AAAAGTTGCA GACAACCTCA CTTCTACGCT GAAGCAGCCT 1320
GTAACTCTAA CTGCAGGAAA TTTAGTACTT AAACGTGGTG TCACTCTCGA TACGAAAGGC 1380
TTTACTCAGA CCGCGGGTTC CTCTGTTATT ATGGATGCGG GCACAACGTT AAAAGCAAGT 1440
ACAGAGGAGG TCACTTTAAC AGGTCTTTCC ATTCCTGTAG ACTCTTTAGG CGAGGGTAAG 1500
AAAGTTGTAA TTGCTGCTTC TGCAGCAAGT AAAAATGTAG CCCTTAGTGG TCCGATTCTT 1560
CTTTTGGATA ACCAAGGGAA TGCTTATGAA AATCACGACT TAGGAAAAAC TCAAGACTTT 1620
TCATTTGTGC AGCTCTCTGC TCTGGGTACT GCAACAACTA CAGATGTTCC AGCGGTTCCT 1680
ACAGTAGCAA CTCCTACGCA CTATGGGTAT CAAGGTACTT GGGGAATGAC TTGGGTTGAT 1740
GATACCGCAA GCACTCCAAA GACTAAGACA GCGACATTAG CTTGGACCAA TACAGGCTAC 1800
CTTCCGAATC CTGAGCGTCA AGGACCTTTA GTTCCTAATA GCCTTTGGGG ATCTTTTTCA 1860
GACATCCAAG CGATTCAAGG TGTCATAGAG AGAAGTGCTT TGACTCTTTG TTCAGATCGA 1920
GGCTTCTGGG CTGCGGGAGT CGCCAATTTC TTAGATAAAG ATAAGAAAGG GGAAAAACGC 1980
AAATACCGTC ATAAATCTGG TGGATATGCT ATCGGAGGTG CAGCGCAAAC TTGTTCTGAA 2040
AACTTAATTA GCTTTGCCTT TTGCCAACTC TTTGGTAGCG ATAAAGATTT CTTAGTCGCT 2100
AAAAATCATA CTGATACCTA TGCAGGAGCC TTCTATATCC AACACATTAC AGAATGTAGT 2160
GGGTTCATAG GTTGTCTCTT AGATAAACTT CCTGGCTCTT GGAGTCATAA ACCCCTCGTT 2220
TTAGAAGGGC AGCTCGCTTA TAGCCACGTC AGTAATGATC TGAAGACAAA GTATACTGCG 2280
TATCCTGAGG TGAAAGGTTC TTGGGGGAAT AATGCTTTTA ACATGATGTT GGGAGCTTCT 2340
TCTCATTCTT ATCCTGAATA CCTGCATTGT TTTGATACCT ATGCTCCATA CATCAAACTG 2400
AATCTGACCT ATATACGTCA GGACAGCTTC TCGGAGAAAG GTACAGAAGG AAGATCTTTT 2460
GATGACAGCA ACCTCTTCAA TTTATCTTTG CCTATAGGGG TGAAGTTTGA GAAGTTCTCT 2520
GATTGTAATG ACTTTTCTTA TGATCTGACT TTATCCTATG TTCCTGATCT TATCCGCAAT 2580
GATCCCAAAT GCACTACAGC ACTTGTAATC AGCGGAGCCT CTTGGGAAAC TTATGCCAAT 2640
AACTTAGCAC GACAGGCCTT GCAAGTGCGT GCAGGCAGTC ACTACGCCTT CTCTCCTATG 2700
TTTGAAGTGC TCGGCCAGTT TGTCTTTGAA GTTCGTGGAT CCTCACGGAT TTATAATGTA 2760
GATCTTGGGG GTAAGTTCCA ATTCTAGGAG CGTCTCTCAT GTCTCAGAAA TTCTG 2815

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 928 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser Ser Thr Leu Ala Cys
1 5 10 15

Phe Thr Ser Cys Ser Thr Val Phe Ala Ala Thr Ala Glu Asn Ile Gly
20 25 30

Pro Ser Asp Ser Phe Asp Gly Ser Thr Asn Thr Gly Thr Tyr Thr Pro
35 40 45

Lys Asn Thr Thr Thr Gly Ile Asp Tyr Thr Leu Thr Gly Asp Ile Thr
50 55 60

Leu Gln Asn Leu Gly Asp Ser Ala Ala Leu Thr Lys Gly Cys Phe Ser
65 70 75 80

Asp Thr Thr Glu Ser Leu Ser Phe Ala Gly Lys Gly Tyr Ser Leu Ser
85 90 95

Phe Leu Asn Ile Lys Ser Ser Ala Glu Gly Ala Ala Leu Ser Val Thr
100 105 110

Thr Asp Lys Asn Leu Ser Leu Thr Gly Phe Ser Ser Leu Thr Phe Leu
115 120 125

Ala Ala Pro Ser Ser Val Ile Thr Thr Pro Ser Gly Lys Gly Ala Val
130 135 140
Lys Cys Gly Gly Asp Leu Thr Phe Asp Asn Asn Gly Thr Ile Leu Phe
145 150 155 160

Lys Gln Asp Tyr Cys Glu Glu Asn Gly Gly Ala Ile Ser Thr Lys Asn
165 170 175

Leu Ser Leu Lys Asn Ser Thr Gly Ser Ile Ser Phe Glu Gly Asn Lys
180 185 190

Ser Ser Ala Thr Gly Lys Lys Gly Gly Ala Ile Cys Ala Thr Gly Thr
195 200 205

Val Asp Ile Thr Asn Asn Thr Ala Pro Thr Leu Phe Ser Asn Asn Ile
210 215 220

Ala Glu Ala Ala Gly Gly Ala Ile Asn Ser Thr Gly Asn Cys Thr Ile
225 230 235 240

Thr Gly Asn Thr Ser Leu Val Phe Ser Glu Asn Ser Val Thr Ala Thr
245 250 255

Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala Asp Val Thr Ile Ser
260 265 270

Gly Asn Gln Ser Val Thr Phe Ser Gly Asn Gln Ala Val Ala Asn Gly
275 280 285

Gly Ala Ile Tyr Ala Lys Lys Leu Thr Leu Ala Ser Gly Gly Gly Gly
290 295 300

Gly Ile Ser Phe Ser Asn Asn Ile Val Gln Gly Thr Thr Ala Gly Asn
305 310 315 320

Gly Gly Ala Ile Ser Ile Leu Ala Ala Gly Glu Cys Ser Leu Ser Ala
325 330 335

Glu Ala Gly Asp Ile Thr Phe Asn Gly Asn Ala Ile Val Ala Thr Thr
340 345 350

Pro Gln Thr Thr Lys Arg Asn Ser Ile Asp Ile Gly Ser Thr Ala Lys
355 360 365

Ile Thr Asn Leu Arg Ala Ile Ser Gly His Ser Ile Phe Phe Tyr Asp
370 375 380

Pro Ile Thr Ala Asn Thr Ala Ala Asp Ser Thr Asp Thr Leu Asn Leu
385 390 395 400

Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr Ser Gly Ser Ile Val
405 410 415

Phe Ser Gly Glu Lys Leu Ser Glu Asp Glu Ala Lys Val Ala Asp Asn
420 425 430

Leu Thr Ser Thr Leu Lys Gln Pro Val Thr Leu Thr Ala Gly Asn Leu
435 440 445

Val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys Gly Phe Thr Gln Thr
450 455 460

Ala Gly Ser Ser Val Ile Met Asp Ala Gly Thr Thr Leu Lys Ala Ser
465 470 475 480

Thr Glu Glu Val Thr Leu Thr Gly Leu Ser Ile Pro Val Asp Ser Leu
485 490 495

Gly Glu Gly Lys Lys Val Val Ile Ala Ala Ser Ala Ala Ser Lys Asn
500 505 510

Val Ala Leu Ser Gly Pro Ile Leu Leu Leu Asp Asn Gln Gly Asn Ala
515 520 525

Tyr Glu Asn His Asp Leu Gly Lys Thr Gln Asp Phe Ser Phe Val Gln
530 535 540

Leu Ser Ala Leu Gly Thr Ala Thr Thr Thr Asp Val Pro Ala Val Pro
545 550 555 560

Thr Val Ala Thr Pro Thr His Tyr Gly Tyr Gln Gly Thr Trp Gly Met
565 570 575

Thr Trp Val Asp Asp Thr Ala Ser Thr Pro Lys Thr Lys Thr Ala Thr
580 585 590

Leu Ala Trp Thr Asn Thr Gly Tyr Leu Pro Asn Pro Glu Arg Gln Gly
595 600 605
Pro Leu Val Pro Asn Ser Leu Trp Gly Ser Phe Ser Asp Ile Gln Ala
610 615 620

Ile Gln Gly Val Ile Glu Arg Ser Ala Leu Thr Leu Cys Ser Asp Arg
625 630 635 640

Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu Asp Lys Asp Lys Lys
645 650 655

Gly Glu Lys Arg Lys Tyr Arg His Lys Ser Gly Gly Tyr Ala Ile Gly
660 665 670

Gly Ala Ala Gln Thr Cys Ser Glu Asn Leu Ile Ser Phe Ala Phe Cys
675 680 685

Gln Leu Phe Gly Ser Asp Lys Asp Phe Leu Val Ala Lys Asn His Thr
690 695 700

Asp Thr Tyr Ala Gly Ala Phe Tyr Ile Gln His Ile Thr Glu Cys Ser
705 710 715 720

Gly Phe Ile Gly Cys Leu Leu Asp Lys Leu Pro Gly Ser Trp Ser His
725 730 735

Lys Pro Leu Val Leu Glu Gly Gln Leu Ala Tyr Ser His Val Ser Asn
740 745 750

Asp Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu Val Lys Gly Ser Trp
755 760 765

Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala Ser Ser His Ser Tyr
770 775 780

Pro Glu Tyr Leu His Cys Phe Asp Thr Tyr Ala Pro Tyr Ile Lys Leu
785 790 795 800

Asn Leu Thr Tyr Ile Arg Gln Asp Ser Phe Ser Glu Lys Gly Thr Glu
805 810 815

Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn Leu Ser Leu Pro Ile
820 825 830

Gly Val Lys Phe Glu Lys Phe Ser Asp Cys Asn Asp Phe Ser Tyr Asp
835 840 845

Leu Thr Leu Ser Tyr Val Pro Asp Leu Ile Arg Asn Asp Pro Lys Cys
850 855 860

Thr Thr Ala Leu Val Ile Ser Gly Ala Ser Trp Glu Thr Tyr Ala Asn
865 870 875 880

Asn Leu Ala Arg Gln Ala Leu Gln Val Arg Ala Gly Ser His Tyr Ala
885 890 895

Phe Ser Pro Met Phe Glu Val Leu Gly Gln Phe Val Phe Glu Val Arg
900 905 910

Gly Ser Ser Arg Ile Tyr Asn Val Asp Leu Gly Gly Lys Phe Gln Phe
915 920 925




(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3052 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

^^^^^^^^^^^^^^^^^^ 
GATGTCTCAA TATCTAACGT CGATAACTCT GCATTAAATA AAGCCTGCTT CAATGTOACC 240
TCAGGAAGTG TGACGTTCGC AGGAAATCAT CATGGGTTAT ATTTTAATAA TATTTCCTCA 300
GGAACTACAA AGGAAGGGGC TGTACTTTGT TGCCAAGATC CTCAAGCAAC GGCACCTTTT 360
TCTGGGTTCT CCACGCTCTC TTTTATTCAG AGCCCCGGAG ATATTAAAGA ACAGGGATGT 420
CTCTATTCAA AAAATGCACT TATGCTCTTA AACAATTATG TAGTGCGTTT TGAACAAAAC 480
CAAAGTAAGA CTAAAGGCGG AGCTATTAGT GGGGCGAATG TTACTATAGT AGGCAACTAC 540
GATTCCGTCT CTTTCTATCA GAATGCAGCC ACTTTTGGAG GTGCTATCCA TTCTTCAGGT 600
CCCCTACAGA TTGCAGTAAA TCAGGCAGAG ATAAGATTTG CACAAAATAC TGCCAAGAAT 660
GGTTCTGGAG GGGCTTTGTA CTCCGATGGT GATATTGATA TTGATCAGAA TGCTTATGTT 720
CTATTTCGAG AAAATGAGGC ATTGACTACT GCTATAGGTA AGGGAGGGGC TGTCTGTTGT 780
CTTCCCACTT CAGGAAGTAG TACTCCAGTT CCTATTGTGA CTTTCTCTGA CAATAAACAG 840
TTAGTCTTTG AAAGAAACCA TTCCATAATG GGTGGCGGAG CCATTTATGC TAGGAAACTT 900
AGCATCTCTT CAGGAGGTCC TACTCTATTT ATCAATAATA TATCATATGC AAATTCGCAA 960
AATTTAGGTG GAGCTATTGC CATTGATACT GGAGGGGAGA TCAGTTTATC AGCAGAGAAA 1020
GGAACAATTA CATTCCAAGG AAACCGGACG AGCTTACCGT TTTTGAATGG CATCCATCTT 1080
TTACAAAATG CTAAATTCCT GAAATTACAG GCGAGAAATG GATGCTCTAT AGAATTTTAT 1140
GATCCTATTA CTTCTGAAGC AGATGGGTCT ACCCAATTGA ATATCAACGG AGATCCTAAA 1200
AATAAAGAGT ACACAGGGAC CATACTCTTT TCTGGAGAAA AGAGTCTAGC AAACGATCCT 1260
AGGGATTTTA AATCTACAAT CCCTCAGAAC GTCAACCTGT CTGCAGGATA CTTAGTTATT 1320
AAAGAGGGGG CCGAAGTCAC AGTTTCAAAA TTCACGCAGT CTCCAGGATC GCATTTAGTT 1380
TTAGATTTAG GAACCAAACT GATAGCCTCT AAGGAAGACA TTGCCATCAC AGGCCTCGCG 1440
ATAGATATAG ATAGCTTAAG CTCATCCTCA ACAGCAGCTG TTATTAAAGC AAACACCGCA 1500
AATAAACAGA TATCCGTGAC GGACTCTATA GAACTTATCT CGCCTACTGG CAATGCCTAT 1560
GAAGATCTCA GAATGAGAAA TTCACAGACG TTCCCTCTGC TCTCTTTAGA GCCTGGAGCC 1620
GGGGGTAGTG TGACTGTAAC TGCTGGAGAT TTCCTACCGG TAAGTCCCCA TTATGGTTTT 1680
CAAGGCAATT GGAAATTAGC TTGGACAGGA ACTGGAAACA AAGTTGGAGA ATTCTTCTGG 1740
GATAAAATAA ATTATAAGCC TAGACCTGAA AAAGAAGGAA ATTTAGTTCC TAATATCTTG 1800
TGGGGGAATG CTGTAAATGT CAGATCCTTA ATGCAGGTTC AAGAGACCCA TGCATCGAGC 1860
TTACAGACAG ATCGAGGGCT GTGGATCGAT GGAATTGGGA ATTTCTTCCA TGTATCTGCC 1920
TCCGAAGACA ATATAAGGTA CCGTCATAAC AGCGGTGGAT ATGTTCTATC TGTAAATAAT 1980
GAGATCACAC CTAAGCACTA TACTTCGATG GCATTTTCCC AACTCTTTAG TAGAGACAAG 2040
GACTATGCGG TTTCCAACAA CGAATACAGA ATGTATTTAG GATCGTATCT CTATCAATAT 2100
ACAACCTCCC TAGGGAATAT TTTCCGTTAT GCTTCGCGTA ACCCTAATGT AAACGTCGGG 2160
ATTCTCTCAA GAAGGTTTCT TCAAAATCCT CTTATGATTT TTCATTTTTT GTGTGCTTAT 2220
GGTCATGCCA CCAATGATAT GAAAACAGAC TACGCAAATT TCCCTATGGT GAAAAACAGC 2280
TGGAGAAACA ATTGTTGGGC TATAGAGTGC GGAGGGAGCA TGCCTCTATT GGTATTTGAG 2340
AACGGAAGAC TTTTCCAAGG TGCCATCCCA TTTATGAAAC TACAATTAGT TTATGCTTAT 2400
CAGGGAGATT TCAAAGAGAC GACTGCAGAT GGCCGTAGAT TTAGTAATGG GAGTTTAACA 2460
TCGATTTCTG TACCTCTAGG CATACGCTTT GAGAAGCTGG CACTTTCTCA GGATGTACTC 2520
TATGACTTTA GTTTCTCCTA TATTCCTGAT ATTTTCCGTA AGGATCCCTC ATGTGAAGCT 2580
GCTCTGGTGA TTAGCGGAGA CTCCTGGCTT GTTCCGGCAG CACACGTATC AAGACATGCT 2640
TTTGTAGGGA GTGGAACGGG TCGGTATCAC TTTAACGACT ATACTGAGCT CTTATGTCGA 2700
GGAAGTATAG AATGCCGCCC CCATGCTAGG AATTATAATA TAAACTGTGG AAGCAAATTT 2760
CGTTTTTAGA AGGTTTCCAT TGCCTGTGTG GTTCCGGATC TTAACTATAA ATCCTGGACT 2820
ATGGATCATA GGCATTGGGT TTCTCGAACT TGTGTGGAGA ATAACGACAT TTTATATGCA 2880
TAACGGAATA CTCGTATCAC CTCAGCCCCT AGAGACATTC TTTAGGGGTT CTTTATTTGT 2940
CTAAACTTCG TATTTTATCG AGAATCCTTT ACGTTCTTGG TTTGCTTGTC TCCGAGGAGT 3000
TCTCTAACGA ATCATAGGGA TTCCAGGGTT CTGTTCCTTG AGTCCTTTGG CA 3052

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 922 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Arg Phe Ser Leu Cys Gly Phe Pro Leu Val Phe Ser Leu Thr Leu
1 5 10 15

Leu Ser Val Phe Asp Thr Ser Leu Ser Ala Thr Thr Ile Ser Leu Thr
20 25 30

Pro Glu Asp Ser Phe His Gly Asp Ser Gln Asn Ala Glu Arg Ser Tyr
35 40 45

Asn Val Gln Ala Gly Asp Val Tyr Ser Leu Thr Gly Asp Val Ser Ile
50 55 60

Ser Asn Val Asp Asn Ser Ala Leu Asn Lys Ala Cys Phe Asn Val Thr
65 70 75 80

Ser Gly Ser Val Thr Phe Ala Gly Asn His His Gly Leu Tyr Phe Asn
85 90 95

Asn Ile Ser Ser Gly Thr Thr Lys Glu Gly Ala Val Leu Cys Cys Gln
100 105 110

Asp Pro Gln Ala Thr Ala Arg Phe Ser Gly Phe Ser Thr Leu Ser Phe
115 120 125

Ile Gln Ser Pro Gly Asp Ile Lys Glu Gln Gly Cys Leu Tyr Ser Lys
130 135 140

Asn Ala Leu Met Leu Leu Asn Asn Tyr Val Val Arg Phe Glu Gln Asn
145 150 155 160

Gln Ser Lys Thr Lys Gly Gly Ala Ile Ser Gly Ala Asn Val Thr Ile
165 170 175

Val Gly Asn Tyr Asp Ser Val Ser Phe Tyr Gln Asn Ala Ala Thr Phe
180 185 190

Gly Gly Ala Ile His Ser Ser Gly Pro Leu Gln Ile Ala Val Asn Gln
195 200 205

Ala Glu Ile Arg Phe Ala Gln Asn Thr Ala Lys Asn Gly Ser Gly Gly
210 215 220

Ala Leu Tyr Ser Asp Gly Asp Ile Asp Ile Asp Gln Asn Ala Tyr Val
225 230 235 240

Leu Phe Arg Glu Asn Glu Ala Leu Thr Thr Ala Ile Gly Lys Gly Gly
245 250 255

Ala Val Cys Cys Leu Pro Thr Ser Gly Ser Ser Thr Pro Val Pro Ile
260 265 270

Val Thr Phe Ser Asp Asn Lys Gln Leu Val Phe Glu Arg Asn His Ser
275 280 285

Ile Met Gly Gly Gly Ala Ile Tyr Ala Arg Lys Leu Ser Ile Ser Ser
290 295 300

Gly Gly Pro Thr Leu Phe Ile Asn Asn Ile Ser Tyr Ala Asn Ser Gln
305 310 315 320

Asn Leu Gly Gly Ala Ile Ala Ile Asp Thr Gly Gly Glu Ile Ser Leu
325 330 335

Ser Ala Glu Lys Gly Thr Ile Thr Phe Gln Gly Asn Arg Thr Ser Leu
340 345 350

Pro Phe Leu Asn Gly Ile His Leu Leu Gln Asn Ala Lys Phe Leu Lys
355 360 365

Leu Gln Ala Arg Asn Gly Cys Ser Ile Glu Phe Tyr Asp Pro Ile Thr
370 375 380

Ser Glu Ala Asp Gly Ser Thr Gln Leu Asn Ile Asn Gly Asp Pro Lys
385 390 395 400

Asn Lys Glu Tyr Thr Gly Thr Ile Leu Phe Ser Gly Glu Lys Ser Leu
405 410 415

Ala Asn Asp Pro Arg Asp Phe Lys Ser Thr Ile Pro Gln Asn Val Asn
420 425 430

Leu Ser Ala Gly Tyr Leu Val Ile Lys Glu Gly Ala Glu Val Thr Val
435 440 445

Ser Lys Phe Thr Gln Ser Pro Gly Ser His Leu Val Leu Asp Leu Gly
450 455 460

Thr Lys Leu Ile Ala Ser Lys Glu Asp Ile Ala Ile Thr Gly Leu Ala
465 470 475 480

Ile Asp Ile Asp Ser Leu Ser Ser Ser Ser Thr Ala Ala Val Ile Lys
485 490 495

Ala Asn Thr Ala Asn Lys Gln Ile Ser Val Thr Asp Ser Ile Glu Leu
500 505 510

Ile Ser Pro Thr Gly Asn Ala Tyr Glu Asp Leu Arg Met Arg Asn Ser
515 520 525

Gln Thr Phe Pro Leu Leu Ser Leu Glu Pro Gly Ala Gly Gly Ser Val
530 535 540

Thr Val Thr Ala Gly Asp Phe Leu Pro Val Ser Pro His Tyr Gly Phe
545 550 555 560

Gln Gly Asn Trp Lys Leu Ala Trp Thr Gly Thr Gly Asn Lys Val Gly
565 570 575

Glu Phe Phe Trp Asp Lys Ile Asn Tyr Lys Pro Arg Pro Glu Lys Glu
580 585 590

Gly Asn Leu Val Pro Asn Ile Leu Trp Gly Asn Ala Val Asn Val Arg
595 600 605

Ser Leu Met Gln Val Gln Glu Thr His Ala Ser Ser Leu Gln Thr Asp
610 615 620

Arg Gly Leu Trp Ile Asp Gly Ile Gly Asn Phe Phe His Val Ser Ala
625 630 635 640

Ser Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly Tyr Val Leu
645 650 655

Ser Val Asn Asn Glu Ile Thr Pro Lys His Tyr Thr Ser Met Ala Phe
660 665 670

Ser Gln Leu Phe Ser Arg Asp Lys Asp Tyr Ala Val Ser Asn Asn Glu
675 680 685

Tyr Arg Met Tyr Leu Gly Ser Tyr Leu Tyr Gln Tyr Thr Thr Ser Leu
690 695 700

Gly Asn Ile Phe Arg Tyr Ala Ser Arg Asn Pro Asn Val Asn Val Gly
705 710 715 720

Ile Leu Ser Arg Arg Phe Leu Gln Asn Pro Leu Met Ile Phe His Phe
725 730 735

Leu Cys Ala Tyr Gly His Ala Thr Asn Asp Met Lys Thr Asp Tyr Ala
740 745 750

Asn Phe Pro Met Val Lys Asn Ser Trp Arg Asn Asn Cys Trp Ala Ile
755 760 765

Glu Cys Gly Gly Ser Met Pro Leu Leu Val Phe Glu Asn Gly Arg Leu
770 775 780

Phe Gln Gly Ala Ile Pro Phe Met Lys Leu Gln Leu Val Tyr Ala Tyr
785 790 795 800

Gln Gly Asp Phe Lys Glu Thr Thr Ala Asp Gly Arg Arg Phe Ser Asn
805 810 815

Gly Ser Leu Thr Ser Ile Ser Val Pro Leu Gly Ile Arg Phe Glu Lys
820 825 830

Leu Ala Leu Ser Gln Asp Val Leu Tyr Asp Phe Ser Phe Ser Tyr Ile
835 840 845

Pro Asp Ile Phe Arg Lys Asp Pro Ser Cys Glu Ala Ala Leu Val Ile
850 855 860

Ser Gly Asp Ser Trp Leu Val Pro Ala Ala His Val Ser Arg His Ala
865 870 875 880
Phe Val Gly Ser Gly Thr Gly Arg Tyr His Phe Asn Asp Tyr Thr Glu
885 890 895

Leu Leu Cys Arg Gly Ser Ile Glu Cys Arg Pro His Ala Arg Asn Tyr
900 905 910

Asn Ile Asn Cys Gly Ser Lys Phe Arg Phe

915 920 
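
The nucleotide listings (SEQ ID NO: 3, 5 and 7) are printed in blocks of ten bases with a running position counter at the end of each row, so each row can be audited by counting bases. The Python sketch below illustrates one way to do this. It assumes each row has already been rejoined onto a single line with its counter as the last whitespace-separated field; `check_rows` and its `start` parameter are hypothetical names introduced here for illustration.

```python
def check_rows(rows, start=0):
    """Return (row_number, printed_counter, counted_position) for every row
    whose printed counter disagrees with the number of bases counted so far.

    `start` is the position reached before the first row in `rows`.
    """
    position = start
    problems = []
    for row_number, row in enumerate(rows, 1):
        # The last field is the printed counter; the rest are base blocks.
        *blocks, counter = row.split()
        position += sum(len(block) for block in blocks)
        printed = int(counter)
        if printed != position:
            problems.append((row_number, printed, position))
            # Resynchronise on the printed value so one bad row is reported once.
            position = printed
    return problems


# Last two rows of SEQ ID NO: 5, which follow the row ending at position 2940.
tail_of_seq5 = [
    "CTAAACTTCG TATTTTATCG AGAATCCTTT ACGTTCTTGG TTTGCTTGTC TCCGAGGAGT 3000",
    "TCTCTAACGA ATCATAGGGA TTCCAGGGTT CTGTTCCTTG AGTCCTTTGG CA 3052",
]
print(check_rows(tail_of_seq5, start=2940))  # prints []
```

A counter that has been split by extraction (for example `12 0` in place of `120`) would need to be rejoined before such a check, since the sketch treats only the final field as the counter.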

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 526 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



ATGAAGATTC CACTCCGCTT TTTATTGATA TCATTAGTAC CTACGCTTTC TATGTCGAAT 60 

TTATTAGGAG CTGCTACTAC CGAAGAGCTA TCGGCTAGCA ATAGCTTCGA TGGAACTACA 120

TCAACAACAA GCTTTTCTAG TAAAACATCA TCGGCTACAG ATGGCACCAA TTATGTTTTT 180 

AAAGATTCTG TAGTTATAGA AAATGTACCC AAAACAGGGG AAACTCAGTC TACTAGTTGT 240 

TTTAAAAATG ACGCTGCAGC TGGAGATCTA AATTTCTTAG GAGGGGGATT TTCTTTCACA 300 

TTTAGCAATA TCGATGCAAC CACGGCTTCT GGAGCTGCTA TTGGAAGTGA AGCAGCTAAT 360 

AAGACAGTCA CGTTATCAGG ATTTTCGGCA CTTTCTTTTC TTAAATCCCC AGCAAGTACA 420 

GTGACTAATG GATTGGGAGC TATCAATGTT AAAGGGAATT TAAGCCTATT GGATAATGAT 480

AAGGTATTGA TTCAGGACAA TTTCTCAACA GGAGATGGCG GAGCAATTAA TTGTGCAGGC 540

TCCTTGAAGA TCGCAAACAA TAAGTCCCTT TCTTTTATTG GAAATAGTTC TTCAACACGT 600

GGCGGAGCGA TTCATACCAA AAACCTCACA CTATCTTCTG GTGGGGAAAC TCTATTTCAG 660

GGGAATACAG CGCCTACGGC TGCTGGTAAA GGAGGTGCTA TCGCGATTGC AGACTCTGGC 720

ACCCTATCCA TTTCTGGAGA CAGTGGCGAC ATTATCTTTG AAGGCAATAC GATAGGAGCT 780

ACAGGAACCG TCTCTCATAG TGCTATTGAT TTAGGAACTA GCGCTAAGAT AACTGCGTTA 840

CGTGCTGCGC AAGGACATAC GATATACTTT TATGATCCGA TTACTGTAAC AGGATCGACA 900 

TCTGTTGCTG ATGCTCTCAA TATTAATAGC CCTGATACTG GAGATAACAA AGAGTATACG 960 

GGAACCATAG TCTTTTCTGG AGAGAAGCTC ACGGAGGCAG AAGCTAAAGA TGAGAAGAAC 1020 

CGCACTTCTA AATTACTTCA AAATGTTGCT TTTAAAAATG GGACTGTAGT TTTAAAAGGT 1080 

GATGTCGTTT TAAGTGCGAA CGGTTTCTCT CAGGATGCAA ACTCTAAGTT GATTATGGAT 1140

TTAGGGACGT CGTTGGTTGC AAACACCGAA AGTATCGAGT TAACGAATTT GGAAATTAAT 1200

ATAGACTCTC TCAGGAACGG GAAAAAGATA AAACTCAGTG CTGCCACAGC TCAGAAAGAT 1260

ATTCGTATAG ATCGTCCTGT TGTACTGGCA ATTAGCGATG AGAGTTTTTA TCAAAATGGC 1320

TTTTTGAATG AGGACCATTC CTATGATGGG ATTCTTGAGT TAGATGCTGG GAAAGACATC 1380

GTGATTTCTG CAGATTCTCG CAGTATAAAT GCTGTACAAT CTCCGTATGG CTATCAGGGA 1440

AAGTGGACAA TCAATTGGTC TACTGATGAT AAGAAAGCTA CGGTTTCTTG GGCAAAGCAA 1500

AGTTTTAATC CCACTGCTGA GCAGGAGGCT CCGTTAGTTC CTAATCTTCT TTGGGGTTCT 1560

TTTATAGATG TTCGTCCCTT CCAAAATTTT ATAGAGCTAG GTACTGAAGG TGCTCCTTAC 1620 

GAAAAGAGAT TTTGGGTTGC AGGCATTTCC AATGTTTTGC ATAGGAGCGG TCGTGAAAAT 1680 

CAAAGGAAAT TCCGTCATGT GAGTGGAGGT GCTGTAGTAG GTGCTAGCAC GAGGATGCCG 1740

GGTGGTGATA CCTTGTCTCT GGGTTTTGCT CAGCTCTTTG CGCGTGACAA AGACTACTTT 1800 

ATGAATACCA ATTTCGCAAA GACCTACGCA GGATCTTTAC GTTTGCAGCA CGATGCTTCC 1860 

CTATACTCTG TGGTGAGTAT CCTTTTAGGA GAGGGAGGAC TCCGCGAGAT CCTGTTGCCT 1920 

TATGTTTCCA AGACTCTGGC GTGCTCTTTC TATGGGCAGC TTAGCTACGG CCATACGGAT 1980 

CATCGCATGA AGACCGAGTC TCTACCCCCC CCCCCCCCGA CGCTCTCGAC GGATCATACT 2040

TCTTGGGGAG GATATGTCTG GGCTGGAGAG CTGGGAACTC GAGTTGCTGT TGAAAATACC 2100

AGCGGCAGAG GATTTTTCCG AGAGTACACT CCATTTGTAA AAGTCCAAGC TGTTTACTCG 2160



rrnrATPTj 9-? on 

TATAACCTTG CGATTCCTCT TGGAATCAAG TTAGAGAAAC GGTTTGCAGA GCAATATTAT 2280






™Sc' ^00= ^^^^^^^^^^ --.-,ec 

ATTCrrCAGG cSScG??? ScA^^^J^o ^a^SS^ ^^^^'^^ ACAGGCTGGT 
GGCTTTGAAT GGCGGGGATC TTCTcSSr ^^^^^^^^ CAGAGCTTTT CGGGAACTTT 
TTTTAG nCTCGTAGC TATAATGTAG ATGCGGGTAG CAAAATCAAA 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 841 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 
Met Lys Ile Pro Leu Arg Phe Leu Leu Ile Ser Leu Val Pro Thr Leu
1 5 10 15

Ser Met Ser Asn Leu Leu Gly Ala Ala Thr Thr Glu Glu Leu Ser Ala
20 25 30

Ser Asn Ser Phe Asp Gly Thr Thr Ser Thr Thr Ser Phe Ser Ser Lys
35 40 45

Thr Ser Ser Ala Thr Asp Gly Thr Asn Tyr Val Phe Lys Asp Ser Val
50 55 60

Val Ile Glu Asn Val Pro Lys Thr Gly Glu Thr Gln Ser Thr Ser Cys
65 70 75 80

Phe Lys Asn Asp Ala Ala Ala Gly Asp Leu Asn Phe Leu Gly Gly Gly
85 90 95

Phe Ser Phe Thr Phe Ser Asn Ile Asp Ala Thr Thr Ala Ser Gly Ala
100 105 110

Ala Ile Gly Ser Glu Ala Ala Asn Lys Thr Val Thr Leu Ser Gly Phe
115 120 125

Ser Ala Leu Ser Phe Leu Lys Ser Pro Ala Ser Thr Val Thr Asn Gly
130 135 140

Leu Gly Ala Ile Asn Val Lys Gly Asn Leu Ser Leu Leu Asp Asn Asp
145 150 155 160

Lys Val Leu Ile Gln Asp Asn Phe Ser Thr Gly Asp Gly Gly Ala Ile
165 170 175

Asn Cys Ala Gly Ser Leu Lys Ile Ala Asn Asn Lys Ser Leu Ser Phe
180 185 190

Ile Gly Asn Ser Ser Ser Thr Arg Gly Gly Ala Ile His Thr Lys Asn
195 200 205

Leu Thr Leu Ser Ser Gly Gly Glu Thr Leu Phe Gln Gly Asn Thr Ala
210 215 220

Pro Thr Ala Ala Gly Lys Gly Gly Ala Ile Ala Ile Ala Asp Ser Gly
225 230 235 240

Thr Leu Ser Ile Ser Gly Asp Ser Gly Asp Ile Ile Phe Glu Gly Asn
245 250 255

Thr Ile Gly Ala Thr Gly Thr Val Ser His Ser Ala Ile Asp Leu Gly
260 265 270

Thr Ser Ala Lys Ile Thr Ala Leu Arg Ala Ala Gln Gly His Thr Ile
275 280 285

Tyr Phe Tyr Asp Pro Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp
290 295 300

Ala Leu Asn Ile Asn Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr
305 310 315 320



Gly Thr Ile Val Phe Ser Gly Glu Lys Leu Thr Glu Ala Glu Ala Lys
325 330 335

Asp Glu Lys Asn Arg Thr Ser Lys Leu Leu Gln Asn Val Ala Phe Lys
340 345 350

Asn Gly Thr Val Val Leu Lys Gly Asp Val Val Leu Ser Ala Asn Gly
355 360 365

Phe Ser Gln Asp Ala Asn Ser Lys Leu Ile Met Asp Leu Gly Thr Ser
370 375 380

Leu Val Ala Asn Thr Glu Ser Ile Glu Leu Thr Asn Leu Glu Ile Asn
385 390 395 400

Ile Asp Ser Leu Arg Asn Gly Lys Lys Ile Lys Leu Ser Ala Ala Thr
405 410 415

Ala Gln Lys Asp Ile Arg Ile Asp Arg Pro Val Val Leu Ala Ile Ser
420 425 430

Asp Glu Ser Phe Tyr Gln Asn Gly Phe Leu Asn Glu Asp His Ser Tyr
435 440 445

Asp Gly Ile Leu Glu Leu Asp Ala Gly Lys Asp Ile Val Ile Ser Ala
450 455 460

Asp Ser Arg Ser Ile Asn Ala Val Gln Ser Pro Tyr Gly Tyr Gln Gly
465 470 475 480

Lys Trp Thr Ile Asn Trp Ser Thr Asp Asp Lys Lys Ala Thr Val Ser
485 490 495

Trp Ala Lys Gln Ser Phe Asn Pro Thr Ala Glu Gln Glu Ala Pro Leu
500 505 510

Val Pro Asn Leu Leu Trp Gly Ser Phe Ile Asp Val Arg Pro Phe Gln
515 520 525

Asn Phe Ile Glu Leu Gly Thr Glu Gly Ala Pro Tyr Glu Lys Arg Phe
530 535 540

Trp Val Ala Gly Ile Ser Asn Val Leu His Arg Ser Gly Arg Glu Asn
545 550 555 560

Gln Arg Lys Phe Arg His Val Ser Gly Gly Ala Val Val Gly Ala Ser
565 570 575

Thr Arg Met Pro Gly Gly Asp Thr Leu Ser Leu Gly Phe Ala Gln Leu
580 585 590

Phe Ala Arg Asp Lys Asp Tyr Phe Met Asn Thr Asn Phe Ala Lys Thr
595 600 605

Tyr Ala Gly Ser Leu Arg Leu Gln His Asp Ala Ser Leu Tyr Ser Val
610 615 620

Val Ser Ile Leu Leu Gly Glu Gly Gly Leu Arg Glu Ile Leu Leu Pro
625 630 635 640

Tyr Val Ser Lys Thr Leu Pro Cys Ser Phe Tyr Gly Gln Leu Ser Tyr
645 650 655

Gly His Thr Asp His Arg Met Lys Thr Glu Ser Leu Pro Pro Pro Pro
660 665 670

Pro Thr Leu Ser Thr Asp His Thr Ser Trp Gly Gly Tyr Val Trp Ala
675 680 685

Gly Glu Leu Gly Thr Arg Val Ala Val Glu Asn Thr Ser Gly Arg Gly
690 695 700

Phe Phe Arg Glu Tyr Thr Pro Phe Val Lys Val Gln Ala Val Tyr Ser
705 710 715 720
Zl Gl„ ASP ser P.e III Olu Leu Gly Ma 11. Ser Arg Asp PHe ser 
.sp ser HIS L.u "yr Asn Leu Ala 11. Pro L.u Gly 11. Lys L.u Glu 

^1" r-,,. rin Xvr xyr His Val VJl Ala IKt Xyr g.r Prn 

Lys Arg Phe Ala Glu Gin xyr lyr ni 

755 
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Asp Val Cys Arg Ser Asn Pro Lys Cys Thr Thr Thr Leu Leu Ser-ff^n 

770 775 780 

Gln Gly Ser Trp Lys Thr Lys Gly Ser Asn Leu Ala Arg Gln Ala Gly
785 790 795 800

Ile Val Gln Ala Ser Gly Phe Arg Ser Leu Gly Ala Ala Ala Glu Leu
805 810 815

Phe Gly Asn Phe Gly Phe Glu Trp Arg Gly Ser Ser Arg Ser Tyr Asn
820 825 830

Val Asp Ala Gly Ser Lys Ile Lys Phe



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9 : 

ATGAAGTCTT CTTTCCCCAA GTTTGTATTT TCTACATTTG CTATTTTCCC TTTGTCTATG 60

ATTGCTACCG AGACAGTTTT GGATTCAAGT GCGAGTTTCG ATGGGAATAA AAATGGTAAT 120 

TTTTCAGTTC GTGAGAGTCA GGAAGATGCT GGAACTACCT ACCTATTTAA GGGAAATGTC 180 

ACTCTAGAAA ATATTCCTGG AACAGGCACA GCAATCACAA AAAGCTGTTT TAACAACACT 240

AAGGGCGATT TGACTTTCAC AGGTAACGGG AACTCTCTAT TGTTCCAAAC GGTGGATGCA 300

GGGACTGTAG CAGGGGCTGC TGTTAACAGC AGCGTGGTAG ATAAATCTAC CACGTTTATA 360

GGGTTTTCTT CGCTATCTTT TATTGCGTCT CCTGGAAGTT CGATAACTAC CGGCAAAGGA 420

GCCGTTAGCT GCTCTACGGG TAGCTTGAAG TTTGACAAAA ATGTCAGTTT GCTCTTCAGC 480

AAAAACTTTT CAACGGATAA TGGCGGTGCT ATCACCGCAA AAACTCTTTC ATTAACAGGG 540

ACTACAATGT CAGCTCTGTT TTCTGAAAAT ACCTCCTCAA AGAAAGGCGG AGCCATTCAG 600 

ACTTCCGATG CCCTTACCAT TACTGGAAAC CAAGGGGAAG TCTCTTTTTC TGACAATACT 660 

TCTTCGGATT CTGGAGCTGC AATTTTTACA GAAGCCTCGG TGACTATTTC TAATAATGCT 720 

AAAGTTTCCT TTATTGACAA TAAGGTCACA GGAGCGAGCT CCTCAACAAC GGGGGATATG 780

TCAGGAGGTG CTATCTGTGC TTATAAAACT AGTACAGATA CTAAGGTCAC CCTCACTGGA 840

AATCAGATGT TACTCTTCAG CAACAATACA TCGACAACAG CGGGAGGAGC TATCTATGTG 900 

AAAAAGCTCG AACTGGCTTC CGGAGGACTT ACCCTATTCA GTAGAAATAG TGTCAATGGA 960 

GGTACAGCTC CTAAAGGTGG AGCCATAGCT ATCGAAGATA GTGGGGAATT GAGTTTATCC 1020 

GCCGATAGTG GTGACATTGT CTTTTTAGGG AATACAGTCA CTTCTACTAC TCCTGGGACG 1080 

AATAGAAGTA GTATCGACTT AGGAACGAGT GCAAAGATGA CAGCTTTGCG TTCTGCTGCT 1140 

GGTAGAGCCA TCTACTTCTA TGATCCCATA ACTACAGGAT CTTCCACAAC AGTTACAGAT 1200 

GTCTTAAAAG TTAATGAGAC TCCGGCAGAT TCTGCACTAC AATATACAGG GAACATCATC 1260 

TTCACAGGAG AAAAGTTATC AGAGACAGAG GCCGCAGATT CTAAAAATCT TACTTCGAAG 1320

CTACTACAGC CTGTAACTCT TTCAGGAGGT ACTCTATCTT TAAAACATGG AGTGACTCTG 1380

CAGACTCAGG CATTCACTCA ACAGGCAGAT TCTCGTCTCG AAATGGACGT AGGAACTACT 1440

CTAGAACCTG CTGATACTAG CACCATAAAC AATTTGGTCA TTAACATCAG TTCTATAGAC 1500 

GGTGCAAAGA AGGCAAAAAT AGAAACCAAA GCTACGTCAA AAAATCTGAC TTTATCTGGA 1560 

ACCATCACTT TATTGGACCC GACGGGCACG TTTTATGAAA ATCATAGTTT AAGAAATCCT 1620 

CAGTCCTACG ACATCTTAGA GCTCAAAGCT TCTGGAACTG TAACAAGCAC CGCAGTGACT 1680 

CCAGATCCTA TAATGGGTGA GAAATTCCAT TACGGCTATC AGGGAACTTG GGGCCCAATT 1740

GTTTGGGGGA CAGGGGCTTC TACGACTGCA ACCTTCAACT GGACTAAAAC TGGCTATATT 1800 

CCTAATCCCG AGCGTATCGG CTCTTTAGTC CCTAATAGCT TATGGAATGC ATTTATAGAT 1860 

ATTAGCTCTC TCCATTATCT TATGGAGACT GCAAACGAAG GGTTGCAGGG AGACCGTGCT 1920 

TTTTGGTGTG CTGGATTATC TAACTTCTTC CATAAGGATA GTACAAAAAC ACGACGCGGG 1980 

TTTCGCCATT TGAGTGGCGG TTATGTCATA GGAGGAAACC TACATACTTG TTCAGATAAG 2040

ATTCTTAGTG CTGCATTTTG TCAGCTCTTT GGAAGAGATA GAGACTACTT TGTAGCTAAG 2100

AATCAAGGTA CAGTCTACGG AGGAACTCTC TATTACCAGC ACAACGAAAC CTATATCTCT 2160

CTTCCTTGCA AACTACGGCC TTGTTCGTTG TCTTATGTTC CTACAGAGAT TCCTGTTCTC 2220

TTTTCAGGAA ACCTTAGCTA CACCCATACG GATAACGATC TGAAAACCAA GTATACAACA 2280

TATCCTACTG TTAAAGGAAG CTGGGGGAAT GATAGTTTCG CTTTAGAATT CGGTGGAAGA 2340

GCTCCGATTT GCTTAGATGA AAGTGCTCTA TTTGAGCAGT ACATGCCCTT CATGAAATTG 2400

CAGTTTGTCT ATGCACATCA GGAAGGTTTT AAAGAACAGG GAACAGAAGC TCGTGAATTT 2460

GGAAGTAGCC GTCTTGTGAA TCTTGCCTTA CCTATCGGGA TCCGATTTGA TAAGGAATCA 2520

GACTGCCAAG ATGCAACGTA CAATCTAACT CTTGGTTATA CTGTGGATCT TGTTCGTAGT 2580

AACCCCGACT GTACGACAAC ACTGCGAATT AGCGGTGATT CTTGGAAAAC CTTCGGTACG 2640

AATTTGGCAA GACAAGCTTT AGTCCTTCGT GCAGGGAACC ATTTTTGCTT TAACTCAAAT 2700

TTTGAAGCCT TTAGCCAATT TTCTTTTGAA TTGCGTGGGT CATCTCGCAA TTACAATGTA 2760

GACTTAGGAG CAAAATACCA ATTCTAA 2787



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 928 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Lys Ser Ser Phe Pro Lys Phe Val Phe Ser Thr Phe Ala Ile Phe
1 5 10 15

Pro Leu Ser Met Ile Ala Thr Glu Thr Val Leu Asp Ser Ser Ala Ser
20 25 30

Phe Asp Gly Asn Lys Asn Gly Asn Phe Ser Val Arg Glu Ser Gln Glu
35 40 45

Asp Ala Gly Thr Thr Tyr Leu Phe Lys Gly Asn Val Thr Leu Glu Asn
50 55 60

Ile Pro Gly Thr Gly Thr Ala Ile Thr Lys Ser Cys Phe Asn Asn Thr
65 70 75 80

Lys Gly Asp Leu Thr Phe Thr Gly Asn Gly Asn Ser Leu Leu Phe Gln
85 90 95

Thr Val Asp Ala Gly Thr Val Ala Gly Ala Ala Val Asn Ser Ser Val
100 105 110

Val Asp Lys Ser Thr Thr Phe Ile Gly Phe Ser Ser Leu Ser Phe Ile
115 120 125

Ala Ser Pro Gly Ser Ser Ile Thr Thr Gly Lys Gly Ala Val Ser Cys
130 135 140

Ser Thr Gly Ser Leu Lys Phe Asp Lys Asn Val Ser Leu Leu Phe Ser
145 150 155 160

Lys Asn Phe Ser Thr Asp Asn Gly Gly Ala Ile Thr Ala Lys Thr Leu
165 170 175

Ser Leu Thr Gly Thr Thr Met Ser Ala Leu Phe Ser Glu Asn Thr Ser
180 185 190

Ser Lys Lys Gly Gly Ala Ile Gln Thr Ser Asp Ala Leu Thr Ile Thr
195 200 205

Gly Asn Gln Gly Glu Val Ser Phe Ser Asp Asn Thr Ser Ser Asp Ser
210 215 220

Gly Ala Ala Ile Phe Thr Glu Ala Ser Val Thr Ile Ser Asn Asn Ala
225 230 235 240

Lys Val Ser Phe Ile Asp Asn Lys Val Thr Gly Ala Ser Ser Ser Thr
245 250 255

Thr Gly Asp Met Ser Gly Gly Ala Ile Cys Ala Tyr Lys Thr Ser Thr
260 265 270

Asp Thr Lys Val Thr Leu Thr Gly Asn Gln Met Leu Leu Phe Ser Asn
275 280 285

Asn Thr Ser Thr Thr Ala Gly Gly Ala Ile Tyr Val Lys Lys Leu Glu
290 295 300

Leu Ala Ser Gly Gly Leu Thr Leu Phe Ser Arg Asn Ser Val Asn Gly
305 310 315 320

Gly Thr Ala Pro Lys Gly Gly Ala Ile Ala Ile Glu Asp Ser Gly Glu
325 330 335

Leu Ser Leu Ser Ala Asp Ser Gly Asp Ile Val Phe Leu Gly Asn Thr
340 345 350

Val Thr Ser Thr Thr Pro Gly Thr Asn Arg Ser Ser Ile Asp Leu Gly
355 360 365

Thr Ser Ala Lys Met Thr Ala Leu Arg Ser Ala Ala Gly Arg Ala Ile
370 375 380

Tyr Phe Tyr Asp Pro Ile Thr Thr Gly Ser Ser Thr Thr Val Thr Asp
385 390 395 400

Val Leu Lys Val Asn Glu Thr Pro Ala Asp Ser Ala Leu Gln Tyr Thr
405 410 415

Gly Asn Ile Ile Phe Thr Gly Glu Lys Leu Ser Glu Thr Glu Ala Ala
420 425 430

Asp Ser Lys Asn Leu Thr Ser Lys Leu Leu Gln Pro Val Thr Leu Ser
435 440 445

Gly Gly Thr Leu Ser Leu Lys His Gly Val Thr Leu Gln Thr Gln Ala
450 455 460

Phe Thr Gln Gln Ala Asp Ser Arg Leu Glu Met Asp Val Gly Thr Thr
465 470 475 480

Leu Glu Pro Ala Asp Thr Ser Thr Ile Asn Asn Leu Val Ile Asn Ile
485 490 495

Ser Ser Ile Asp Gly Ala Lys Lys Ala Lys Ile Glu Thr Lys Ala Thr
500 505 510

Ser Lys Asn Leu Thr Leu Ser Gly Thr Ile Thr Leu Leu Asp Pro Thr
515 520 525

Gly Thr Phe Tyr Glu Asn His Ser Leu Arg Asn Pro Gln Ser Tyr Asp
530 535 540

Ile Leu Glu Leu Lys Ala Ser Gly Thr Val Thr Ser Thr Ala Val Thr
545 550 555 560

Pro Asp Pro Ile Met Gly Glu Lys Phe His Tyr Gly Tyr Gln Gly Thr
565 570 575

Trp Gly Pro Ile Val Trp Gly Thr Gly Ala Ser Thr Thr Ala Thr Phe
580 585 590

Asn Trp Thr Lys Thr Gly Tyr Ile Pro Asn Pro Glu Arg Ile Gly Ser
595 600 605

Leu Val Pro Asn Ser Leu Trp Asn Ala Phe Ile Asp Ile Ser Ser Leu
610 615 620

His Tyr Leu Met Glu Thr Ala Asn Glu Gly Leu Gln Gly Asp Arg Ala
625 630 635 640

Phe Trp Cys Ala Gly Leu Ser Asn Phe Phe His Lys Asp Ser Thr Lys
645 650 655

Thr Arg Arg Gly Phe Arg His Leu Ser Gly Gly Tyr Val Ile Gly Gly
660 665 670

Asn Leu His Thr Cys Ser Asp Lys Ile Leu Ser Ala Ala Phe Cys Gln
675 680 685

Leu Phe Gly Arg Asp Arg Asp Tyr Phe Val Ala Lys Asn Gln Gly Thr
690 695 700

Val Tyr Gly Gly Thr Leu Tyr Tyr Gln His Asn Glu Thr Tyr Ile Ser
705 710 715 720

Leu Pro Cys Lys Leu Arg Pro Cys Ser Leu Ser Tyr Val Pro Thr Glu
725 730 735

Ile Pro Val Leu Phe Ser Gly Asn Leu Ser Tyr Thr His Thr Asp Asn
740 745 750

Asp Leu Lys Thr Lys Tyr Thr Thr Tyr Pro Thr Val Lys Gly Ser Trp
755 760 765

Gly Asn Asp Ser Phe Ala Leu Glu Phe Gly Gly Arg Ala Pro Ile Cys
770 775 780

Leu Asp Glu Ser Ala Leu Phe Glu Gln Tyr Met Pro Phe Met Lys Leu
785 790 795 800

Gln Phe Val Tyr Ala His Gln Glu Gly Phe Lys Glu Gln Gly Thr Glu
805 810 815

Ala Arg Glu Phe Gly Ser Ser Arg Leu Val Asn Leu Ala Leu Pro Ile
820 825 830

Gly Ile Arg Phe Asp Lys Glu Ser Asp Cys Gln Asp Ala Thr Tyr Asn
835 840 845

Leu Thr Leu Gly Tyr Thr Val Asp Leu Val Arg Ser Asn Pro Asp Cys
850 855 860

Thr Thr Thr Leu Arg Ile Ser Gly Asp Ser Trp Lys Thr Phe Gly Thr
865 870 875 880

Asn Leu Ala Arg Gln Ala Leu Val Leu Arg Ala Gly Asn His Phe Cys
885 890 895

Phe Asn Ser Asn Phe Glu Ala Phe Ser Gln Phe Ser Phe Glu Leu Arg
900 905 910

Gly Ser Ser Arg Asn Tyr Asn Val Asp Leu Gly Ala Lys Tyr Gln Phe
915 920 925



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2757 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



ATGAGATCGT CTTTTTCCTT GTTATTAATA TCTTCATCTC TAGCCTTTCC TCTCTTAATG 60

AGTGTTTCTG CAGATGCTGC CGATCTCACA TTAGGGAGTC GTGACAGTTA TAATGGTGAT 120

ACAAGCACCA CAGAATTTAC TCCTAAAGCG GCAACTTCTG ATGCTAGTGG CACGACCTAT 180

ATTCTCGATG GGGATGTCTC GATAAGCCAA GCAGGGAAAC AAACGAGCTT AACCACAAGT 240

TGTTTTTCTA ACACTGCAGG AAATCTTACC TTCTTAGGGA ACGGATTTTC TCTTCATTTT 300

GACAATATTA TTTCGTCTAC TGTTGCAGGT GTTGTTGTTA GCAATACAGC AGCTTCTGGG 360

ATTACGAAAT TCTCAGGATT TTCAACTCTT CGGATGCTTG CAGCTCCTAG GACCACAGGT 420

AAAGGAGCCA TTAAAATTAC CGATGGTCTG GTGTTTGAGA GTATAGGGAA TCTTGACCAA 480

AATGAAAATG CCTCTAGTGA AAATGGGGGA GCCATCAATA CGAAGACTTT GTCTTTGACT 540

GGGAGTACGC GGTTTGTAGC GTTCCTTGGC AATAGCTCGT CGCAACAAGG GGGAGCGATC 600

TATGCTTCTG GTGACTCTGT GATTTCTGAG AATGCAGGAA TCTTGAGCTT CGGAAACAAC 660

AGTGCGACAA CATCAGGAGG CGCGATCTCT GCTGAAGGGA ACCTTGTGAT CTCCAATAAC 720

CAAAATATCT TTTTCGATGG CTGCAAAGCA ACTACAAATG GCGGAGCTAT TGATTGTAAC 780

AAAGCAGGGG CGAACCCAGA CCCTATCTTG ACTCTTTCAG GAAATGAGAG CCTGCATTTT 840
900 
960 
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GGGGCA.^TTG CGATTCTAGA TTCTGGAGAG ATTAGCATTT CTGCAGATCT CGGCACTATP 

sss src?^- ~i 
= ~ ii i = 

= = — s = ~ ~i 
s?s s^— c= — - 1— E 

ATTAAGGCGA CGGCAGCAAG TAAGGATGTT GCCTTATCAG GGcSJ^^^ S^SI^I^ 
GCTCAGGGGA ACTATTATGA GCATCATAAT CTCAGTCAAC AGCaStc^? ?cS??AAiI 

™Sc ?™ SriE 
^ssj? SIS ~- iiE 
= ~ = is= dii i ii 

11^-^ — s ™j -™ 

= ^i-- ~i iii 

=2S ™s ~ i i 

™- Sii s= 
= ~ ~ ~ isE 

(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 918 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



Met Arg Ser Ser Phe Ser Leu Leu Leu Ile Ser Ser Ser Leu Ala Phe
1 5 10 15

Pro Leu Leu Met Ser Val Ser Ala Asp Ala Ala Asp Leu Thr Leu Gly
20 25 30

Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr Thr Glu Phe Thr Pro
35 40 45

Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr Tyr Ile Leu Asp Gly
50 55 60

Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr Ser Leu Thr Thr Ser
65 70 75 80

Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe Leu Gly Asn Gly Phe
85 90 95

Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr Val Ala Gly Val Val
100 105 110

Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys Phe Ser Gly Phe Ser
115 120 125

Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr Gly Lys Gly Ala Ile
130 135 140

Lys Ile Thr Asp Gly Leu Val Phe Glu Ser Ile Gly Asn Leu Asp Gln
145 150 155 160

Asn Glu Asn Ala Ser Ser Glu Asn Gly Gly Ala Ile Asn Thr Lys Thr
165 170 175

Leu Ser Leu Thr Gly Ser Thr Arg Phe Val Ala Phe Leu Gly Asn Ser
180 185 190

Ser Ser Gln Gln Gly Gly Ala Ile Tyr Ala Ser Gly Asp Ser Val Ile
195 200 205

Ser Glu Asn Ala Gly Ile Leu Ser Phe Gly Asn Asn Ser Ala Thr Thr
210 215 220

Ser Gly Gly Ala Ile Ser Ala Glu Gly Asn Leu Val Ile Ser Asn Asn
225 230 235 240

Gln Asn Ile Phe Phe Asp Gly Cys Lys Ala Thr Thr Asn Gly Gly Ala
245 250 255

Ile Asp Cys Asn Lys Ala Gly Ala Asn Pro Asp Pro Ile Leu Thr Leu
260 265 270

Ser Gly Asn Glu Ser Leu His Phe Leu Asn Asn Thr Ala Gly Asn Ser
275 280 285

Gly Gly Ala Ile Tyr Thr Lys Lys Leu Val Leu Ser Ser Gly Arg Gly
290 295 300

Gly Val Leu Phe Ser Asn Asn Lys Ala Ala Asn Ala Thr Pro Lys Gly
305 310 315 320

Gly Ala Ile Ala Ile Leu Asp Ser Gly Glu Ile Ser Ile Ser Ala Asp
325 330 335

Leu Gly Asn Ile Ile Phe Glu Gly Asn Thr Thr Ser Thr Thr Gly Ser
340 345 350

Pro Ala Ser Val Thr Arg Asn Ala Ile Asp Leu Ala Ser Asn Ala Lys
355 360 365

Phe Leu Asn Leu Arg Ala Thr Arg Gly Asn Lys Val Ile Phe Tyr Asp
370 375 380

Pro Ile Thr Ser Ser Gly Ala Thr Asp Lys Leu Ser Leu Asn Lys Ala
385 390 395 400

Asp Ala Gly Ser Gly Asn Thr Tyr Glu Gly Tyr Ile Val Phe Ser Gly
405 410 415

Glu Lys Leu Ser Glu Glu Glu Leu Lys Lys Pro Asp Asn Leu Lys Ser
420 425 430

Thr Phe Thr Gln Ala Val Glu Leu Ala Ala Gly Ala Leu Val Leu Lys
435 440 445

Asp Gly Val Thr Val Val Ala Asn Thr Ile Thr Gln Val Glu Gly Ser
450 455 460

Lys Val Val Met Asp Gly Gly Thr Thr Phe Glu Ala Ser Ala Glu Gly
465 470 475 480

Val Thr Leu Asn Gly Leu Ala Ile Asn Ile Asp Ser Leu Asp Gly Thr
485 490 495

Asn Lys Ala Ile Ile Lys Ala Thr Ala Ala Ser Lys Asp Val Ala Leu
500 505 510

Ser Gly Pro Ile Met Leu Val Asp Ala Gln Gly Asn Tyr Tyr Glu His
515 520 525

His Asn Leu Ser Gln Gln Gln Val Phe Pro Leu Ile Glu Leu Ser Ala
530 535 540

Gln Gly Thr Met Thr Thr Thr Asp Ile Pro Asp Thr Pro Ile Leu Asn
545 550 555 560

Thr Thr Asn His Tyr Gly Tyr Gln Gly Thr Gly Ile Ile Val Trp Val
565 570 575

Asp Asp Ala Thr Ala Lys Thr Lys Asn Ala Thr Leu Thr Trp Thr Lys
580 585 590

Thr Gly Tyr Lys Pro Asn Pro Glu Arg Gln Gly Pro Leu HI Pro Asn
595 600 605

Ser Leu Trp Gly Ser Phe Val Asp Val Arg Ser Ile Gln Ser Leu Met
610 615 620

Asp Arg Ser Thr Ser Ser Leu Ser Ser Ser Thr Asn Leu Trp Val Ser
625 630 635 640

Gly Ile Ala Asp Phe Leu His Glu Asp Gln Lys Gly Asn Gln Arg Ser
645 650 655

Tyr Arg His Ser Ser Ala Gly Tyr Ala Leu Gly Gly Gly Phe Phe Thr
660 665 670

Ala Ser Glu Asn Phe Phe Asn Phe Ala Phe Cys Gln Leu Phe Gly Tyr
675 680 685

Asp Lys Asp His Leu Val Ala Lys Asn His Thr His Val Tyr Ala Gly
690 695 700

Ala Met Ser Tyr Arg His Leu Gly Glu Ser Lys Thr Leu Ala Lys Ile
705 710 715 720

Leu Ser Gly Asn Ser Asp Ser Leu Pro Phe III Phe Asn Ala Arg Phe
725 730 735

Ala Tyr Gly His Thr Asp Asn Asn Met Thr Thr Lys Tyr Thr Gly Tyr
740 745 750

Ser Pro Val Lys Gly Ser Trp Gly Asn Asp Ala Phe Gly Ile Glu Cys
755 760 765

Gly Gly Ala Ile Pro Val Val Ala Ser Gly Arg Arg Ser Trp Val Asp
770 775 780

Thr His Thr Pro Phe Leu Asn Leu Glu Met Ile Tyr Ala His Gln Asn
785 790 795 800

Asp Phe Lys Glu Asn Gly Thr Glu Gly Arg Ser Phe Gln Ser Glu Asp
805 810 815

Leu Phe Asn Leu Ala Val Pro Val Gly Ile Lys Phe Glu Lys Phe Ser
820 825 830

Asp Lys Ser Thr Tyr Asp Leu Ser Ile Ala Tyr Val Pro Asp Val Ile
835 840 845

Arg Asn Asp Pro Gly Cys Thr Thr Thr Leu Met Val Ser Gly Asp Ser
850 855 860

Trp Ser Thr Cys Gly Thr Ser Leu Ser Arg Gln Ala Leu Leu Val Arg
865 870 875 880

Ala Gly Asn His His Ala Phe Ala Ser Asn Phe Glu Val Phe Ser Gln
885 890 895

Phe Glu Val Glu Leu Arg Gly Ser Ser Arg Ser Tyr Ala Ile Asp Leu
900 905 910

Gly Gly Arg Phe Gly Phe
915



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATGAAATCCT CTCTTCATTG GTTTGTAATC TCGTCATCTT TAGCACTTCC CTTGTCACTA 60
AATTTCTCTG CGTTTGCTGC TGTTGTTGAA ATCAATCTAG GACCTACCAA TAGCTTCTCT 120
GGACCAGGAA CCTACACTCC TCCAGCCCAA ACAACAAATG CAGATGGAAC TATCTATAAT 180
CTAACAGGGG ATGTCTCAAT CACCAATGCA GGATCTCCGA CAGCTCTAAC CGCTTCCTGC 240
TTTAAAGAAA CTACTGGGAA TCTTTCTTTC CAAGGCCACG GCTACCAATT TCTCCTACAA 300
AATATCGATG CGGGAGCGAA CTGTACCTTT ACCAATACAG CTGCAAATAA GCTTCTCTCC 360
TTTTCAGGAT TCTCCTATTT GTCACTAATA CAAACCACGA ATGCTACCAC AGGAACAGGA 420
GCCATCAAGT CCACAGGAGC TTGTTCTATT CAGTCGAACT ATAGTTGCTA CTTTGGCCAA 480
AACTTTTCTA ATGACAATGG AGGCGCCCTC CAAGGCAGCT CTATCAGTCT ATCGCTAAAC 540
CCCAACCTAA CGTTTGCCAA AAACAAAGCA ACGCAAAAAG GGGGTGCCCT CTATTCCACG 600
GGAGGGATTA CAATTAACAA TACGTTAAAC TCAGCATCAT TTTCTGAAAA TACCGCGGCG 660
AACAATGGCG GAGCCATTTA CACGGAAGCT AGCAGTTTTA TTAGCAGCAA CAAAGCAATT 720
AGCTTTATAA ACAATAGTGT GACCGCAACC TCAGCTACAG GGGGAGCCAT TTACTGTAGT 780
AGTACATCAG CCCCCAAACC AGTCTTAACT CTATCAGACA ACGGGGAACT GAACTTTATA 840
GGAAATACAG CAATTACTAG TGGTGGGGCG ATTTATACTG ACAATCTAGT TCTTTCTTCT 900
GGAGGACCTA CGCTTTTTAA AAACAACTCT GCTATAGATA CTGCAGCTCC CTTAGGAGGA 960
GCAATTGCGA TTGCTGACTC TGGATCTTTG AGTCTTTCGG CTCTTGGTGG AGACATCACT 1020
TTTGAAGGAA ACACAGTAGT CAAAGGAGCT TCTTCGAGTC AGACCACTAC CAGAAATTCT 1080
ATTAACATCG GAAACACCAA TGCTAAGATT GTACAGCTGC GAGCCTCTCA AGGCAATACT 1140
ATCTACTTCT ATGATCCTAT AACAACTAAC CATACTGCAG CTCTCTCAGA TGCTCTAAAC 1200
TTAAATGGTC CTGACCTTGC AGGGAATCCT GCATATCAAG GAACCATCGT ATTTTCTGGA 1260
GAGAAGCTCT CGGAAGCAGA AGCTGCAGAA GCTGATAATC TCAAATCTAC AATTCAGCAA 1320
CCTCTAACTC TTGCGGGAGG GCAACTCTCT CTTAAATCAG GAGTCACTCT AGTTGCTAAG 1380
TCCTTTTCGC AATCTCCGGG CTCTACCCTC CTCATGGATG CAGGGACCAC ATTAGAAACC 1440
GCTGATGGGA TCACTATCAA TAATCTTGTT CTCAATGTAG ATTCCTTAAA AGAGACCAAG 1500
AAGGCTACGC TAAAAGCAAC ACAAGCAAGT CAGACAGTCA CTTTATCTGG ATCGCTCTCT 1560
CTTGTAGATC CTTCTGGAAA TGTCTACGAA GATGTCTCTT GGAATAACCC TCAAGTCTTT 1620
TCTTGTCTCA CTCTTACTGC TGACGACCCC GCGAATATTC ACATCACAGA CTTAGCTGCT 1680
GATCCCCTAG AAAAAAATCC TATCCATTGG GGATACCAAG GGAATTGGGC ATTATCTTGG 1740
CAAGAGGATA CTGCGACTAA ATCCAAAGCA GCGACTCTTA CCTGGACAAA AACAGGATAC 1800
AATCCGAATC CTGAGCGTCG TGGAACCTTA GTTGCTAACA CGCTATGGGG ATCCTTTGTT 1860
GATGTGCGCT CCATACAACA GCTTGTAGCC ACTAAAGTAC GCCAATCTCA AGAAACTCGC 1920
GGCATCTGGT GTGAAGGGAT CTCGAACTTC TTCCATAAAG ATAGCACGAA GATAAATAAA 1980
GGTTTTCGCC ACATAAGTGC AGGTTATGTT GTAGGAGCGA CTACAACATT AGCTTCTGAT 2040
AATCTTATCA CTGCAGCCTT CTGCCAATTA TTCGGGAAAG ATAGAGATCA CTTTATAAAT 2100
AAAAATAGAG CTTCTGCCTA TGCAGCTTCT CTCCATCTCC AGCATCTAGC GACCTTGTCT 2160
TCTCCAAGCT TGTTACGCTA CCTTCCTGGA TCTGAAAGTG AGCAGCCTGT CCTCTTTGAT 2220
GCTCAGATCA GCTATATCTA TAGTAAAAAT ACTATGAAAA CCTATTACAC CCAAGCACCA 2280
AAGGGAGAGA GCTCGTGGTA TAATGACGGT TGCGCTCTGG AACTTGCGAG CTCCCTACCA 2340
CACACTGCTT TAAGCCATGA GGGTCTCTTC CACGCGTATT TTCCTTTCAT CAAAGTAGAA 2400
GCTTCGTACA TACACCAAGA TAGCTTCAAA GAACGTAATA CTACCTTGGT ACGATCTTTC 2460
GATAGCGGTG ATTTAATTAA CGTCTCTGTG CCTATTGGAA TTACCTTCGA GAGATTCTCG 2520
AGAAACGAGC GTGCGTCTTA CGAAGCTACT GTCATCTACG TTGCCGATGT CTATCGTAAG 2580
AATCCTGACT GCACGACAGC TCTCCTAATC AACAATACCT CGTGGAAAAC TACAGGAACG 2640
AATCTCTCAA GACAAGCTGG TATCGGAAGA GCAGGGATCT TTTATGCCTT CTCTCCAAAT 2700
CTTGAGGTCA CAAGTAACCT ATCTATGGAA ATTCGTGGAT CTTCACGCAG CTACAATGCA 2760
GATCTTGGAG GTAAGTTCCA GTTCTAA 2787



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 928 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: - 
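Met Lys Ser Ser Leu His Trp Phe Val Ile Ser Ser Ser Leu Ala Leu
1 5 10 15

Pro Leu Ser Leu Asn Phe Ser Ala Phe Ala Ala Val Val Glu Ile Asn
20 25 30

Leu Gly Pro Thr Asn Ser Phe Ser Gly Pro Gly Thr Tyr Thr Pro Pro
35 40 45

Ala Gln Thr Thr Asn Ala Asp Gly Thr Ile Tyr Asn Leu Thr Gly Asp
50 55 60

Val Ser Ile Thr Asn Ala Gly Ser Pro Thr Ala Leu Thr Ala Ser Cys
65 70 75 80

Phe Lys Glu Thr Thr Gly Asn Leu Ser Phe Gln Gly His Gly Tyr Gln
85 90 95

Phe Leu Leu Gln Asn Ile Asp Ala Gly Ala Asn Cys Thr Phe Thr Asn
100 105 110

Thr Ala Ala Asn Lys Leu Leu Ser Phe Ser Gly Phe Ser Tyr Leu Ser
115 120 125

Leu Ile Gln Thr Thr Asn Ala Thr Thr Gly Thr Gly Ala Ile Lys Ser
130 135 140

Thr Gly Ala Cys Ser Ile Gln Ser Asn Tyr Ser Cys Tyr Phe Gly Gln
145 150 155 160

Asn Phe Ser Asn Asp Asn Gly Gly Ala Leu Gln Gly Ser Ser Ile Ser
165 170 175

Leu Ser Leu Asn Pro Asn Leu Thr Phe Ala Lys Asn Lys Ala Thr Gln
180 185 190

Lys Gly Gly Ala Leu Tyr Ser Thr Gly Gly Ile Thr Ile Asn Asn Thr
195 200 205

Leu Asn Ser Ala Ser Phe Ser Glu Asn Thr Ala Ala Asn Asn Gly Gly
210 215 220

Ala Ile Tyr Thr Glu Ala Ser Ser Phe Ile Ser Ser Asn Lys Ala Ile
225 230 235 240

Ser Phe Ile Asn Asn Ser Val Thr Ala Thr Ser Ala Thr Gly Gly Ala
245 250 255

Ile Tyr Cys Ser Ser Thr Ser Ala Pro Lys Pro Val Leu Thr Leu Ser
260 265 270

Asp Asn Gly Glu Leu Asn Phe Ile Gly Asn Thr Ala Ile Thr Ser Gly
275 280 285

Gly Ala Ile Tyr Thr Asp Asn Leu Val Leu Ser Ser Gly Gly Pro Thr
290 295 300

Leu Phe Lys Asn Asn Ser Ala Ile Asp Thr Ala Ala Pro Leu Gly Gly
305 310 315 320

Ala Ile Ala Ile Ala Asp Ser Gly Ser Leu Ser Leu Ser Ala Leu Gly
325 330 335

Gly Asp Ile Thr Phe Glu Gly Asn Thr Val Val Lys Gly Ala Ser Ser
340 345 350

Ser Gln Thr Thr Thr Arg Asn Ser Ile Asn Ile Gly Asn Thr Asn Ala
355 360 365

Lys Ile Val Gln Leu Arg Ala Ser Gln Gly Asn Thr Ile Tyr Phe Tyr
370 375 380

Asp Pro Ile Thr Thr Asn His Thr Ala Ala Leu Ser Asp Ala Leu Asn
385 390 395 400

Leu Asn Gly Pro Asp Leu Ala Gly Asn Pro Ala Tyr Gln Gly Thr Ile
405 410 415

Val Phe Ser Gly Glu Lys Leu Ser Glu Ala Glu Ala Ala Glu Ala Asp
420 425 430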
A3„ Leu L,= se. ... lie olu 01„ p.„ Leu ... Leu Ma ^l'° ol, Olu 
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435 










440 










445 








Leu 


Ser 
450 


Leu 


Lys 


Ser 


Gly 


Val 
455 


Thr 


Leu 


Val 


Ala 


Lys 
460 


Ser 


Phe 


Ser 


Gin 


Ser 


Pro 


Gly 


Ser 


Thr 


Leu 


Leu 


Met 


Asp 


Ala 


Gly Thr 


Thr 


Leu 


Glu 


Thr 


465 










470 










475 










480 


Ala 


Asp 


Gly 


He 


Thr 


He 


Asn 


Asn 


Leu 


Val 


Leu 


Asn 


Val 


Asp 


Ser 


Leu 










485 










490 








495 




Lys 


Glu 


Thr 


Lys 
500 


Lys 


Ala 


Thr 


Leu 


Lys 
505 


Ala 


Thr 


Gin 


Ala 


Ser 
510 


Gin 


Thr 


Val 


Thr 


Leu 


Ser 


Gly 


Ser 


Leu 


Ser 


Leu 


Val 


Asp 


Pro 


Ser 


Gly Asn 


Val 






515 










520 










525 








Tyr 


Glu 


Asp 


Val 


Ser 


Trp 


Asn 


Asn 


Pro 


Gin 


Val 


Phe 


Ser 


Cys 


Leu 


Thr 




530 










535 










540 








Leu 


Thr 


Ala 


Asp 


Asp 


Pro 


Ala 


Asn 


He 


His 


He 


Thr 


Asp 


Leu 


Ala 


Ala 


545 










550 










555 








560 


Asp 


Pro 


Leu 


Glu 


Lys 


Asn 


Pro 


He 


His 


Trp 


Gly Tyr 


Gin 


Gly Asn 


Trp 










565 










570 










!>/::> 


Ala 


Leu 


Ser 


Trp 


Gin 


Glu 


Asp Thr Ala 


Thr 


Lys 


Ser 


Lys 


MX d 


H-La 


Thr 








580 585 590

Leu Thr Trp Thr Lys Thr Gly Tyr Asn Pro Asn Pro Glu Arg Arg Gly
595 600 605

Thr Leu Val Ala Asn Thr Leu Trp Gly Ser Phe Val Asp Val Arg Ser
610 615 620

Ile Gln Gln Leu Val Ala Thr Lys Val Arg Gln Ser Gln X nr Ara
625 630 635 640

Gly Ile Trp Cys Glu Gly Ile Ser Asn Phe Phe His Lys Asp Ser Thr
645 650 655

Lys Ile Asn Lys Gly Phe Arg His Ile Ser Ala Gly Tyr Val Val Gly
660 665 670

Ala Thr Thr Thr Leu Ala Ser Asp Asn Leu Ile Thr Ala Ala Phe Cys
675 680 685

Gln Leu Phe Gly Lys Asp Arg Asp His Phe Ile Asn Lys Asn Arg Ala
690 695 700

Ser Ala Tyr Ala Ala Ser Leu His Leu Gln His Leu Ala Thr Leu Ser
705 710 715 720

Ser Pro Ser Leu Leu Arg Tyr Leu Pro Gly Ser Glu Ser Glu Gln Pro
725 730 735

Val Leu Phe Asp Ala Gln Ile Ser Tyr Ile Tyr Ser Lys Asn Thr Met
740 745 750

Lys Thr Tyr Tyr Thr Gln Ala Pro Lys Gly Glu Ser Ser Trp Tyr Asn
755 760 765

Asp Gly Cys Ala Leu Glu Leu Ala Ser Ser Leu Pro His Thr Ala Leu
770 775 780

Ser His Glu Gly Leu Phe His Ala Tyr Phe Pro Phe Ile Lys Val Glu
785 790 795 800

Ala Ser Tyr Ile His Gln Asp Ser Phe Lys Glu Arg Asn Thr Thr Leu
805 810 815

Val Arg Ser Phe Asp Ser Gly Asp Leu Ile Asn Val Ser Val Pro Ile
820 825 830

Gly Ile Thr Phe Glu Arg Phe Ser Arg Asn Glu Arg Ala Ser Tyr Glu
835 840 845

Ala Thr Val Ile Tyr Val Ala Asp Val Tyr Arg Lys Asn Pro Asp Cys
850 855 860

Thr Thr Ala Leu Leu Ile Asn Asn Thr Ser Trp Lys Thr Thr Gly Thr
865 870 875 880

Asn Leu Ser Arg Gln Ala TTTy"
885 890 895
Phe Ser Pro Asn Leu Glu Val Thr Ser Asn Leu Ser Met Glu Ile Arg
900 905 910

Gly Ser Ser Arg Ser Tyr Asn Ala Asp Leu Gly Gly Lys Phe Gln Phe
915 920 925

(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2793 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATGAAAATAC CCTTGCACAA ACTCCTGATC TCTTCGACTC TTGTCACTCC CATTCTATTG 60
AGCATTGCAA CTTACGGAGC AGATGCTTCT TTATCCCCTA CAGATAGCTT TGATGGAGCG 120
GGCGGCTCTA CATTTACTCC AAAATCTACA GCAGATGCCA ATGGAACGAA CTATGTCTTA 180
TCAGGAAATG TCTATATAAA CGATGCTGGG AAAGGCACAG CATTAACAGG CTGCTGCTTT 240
ACAGAAACTA CGGGTGATCT GACATTTACT GGAAAGGGAT ACTCATTTTC ATTCAACACG 300
GTAGATGCGG GTTCGAATGC AGGAGCTGCG GCAAGCACAA CTGCTGATAA AGCCCTAACA 360
TTCACAGGAT TTTCTAACCT TTCCTTCATT GCAGCTCCTG GAACTACAGT TGCTTCAGGA 420
AAAAGTACTT TAAGTTCTGC AGGAGCCTTA AATCTTACCG ATAATGGAAC GATTCTCTTT 480
AGCCAAAACG TCTCCAATGA AGCTAATAAC AATGGCGGAG CGATCACCAC AAAAACTCTT 540
TCTATTTCTG GGAATACCTC TTCTATAACC TTCACTAGTA ATAGCGCAAA AAAATTAGGT 600
GGAGCGATCT ATAGCTCTGC GGCTGCAAGT ATTTCAGGAA ACACCGGCCA GTTAGTCTTT 660
ATGAATAATA AAGGAGAAAC TGGGGGCGGG GCTCTGGGCT TTGAAGCCAG CTCCTCGATT 720
ACTCAAAATA GCTCCCTTTT CTTCTCTGGA AACACTGCAA CAGATGCTGC AGGCAAGGGC 780
GGGGCCATTT ATTGTGAAAA AACAGGAGAG ACTCCTACTC TTACTATCTC TGGAAATAAA 840
AGTCTGACCT TCGCCGAGAA CTCTTCAGTA ACTCAAGGCG GAGCAATCTG TGCCCATGGT 900
CTAGATCTTT CCGCTGCTGG CCCTACCCTA TTTTCAAATA ATAGATGCGG GAACACAGCT 960
GCAGGCAAGG GCGGCGCTAT TGCAATTGCC GACTCTGGAT CTTTAAGTCT CTCTGCAAAT 1020
CAAGGAGACA TCACGTTCCT TGGCAACACT CTAACCTCAA CCTCCGCGCC AACATCGACA 1080
CGGAATGCTA TCTACCTGGG ATCGTCAGCA AAAATTACGA ACTTAAGGGC AGCCCAAGGC 1140
CAATCTATCT ATTTCTATGA TCCGATTGCA TCTAACACCA CAGGAGCTTC AGACGTTCTG 1200
ACCATCAACC AACCGGATAG CAACTCGCCT TTAGATTATT CAGGAACGAT TGTATTTTCT 1260
GGGGAAAAGC TCTCTGCAGA TGAAGCGAAA GCTGCTGATA ACTTCACATC TATATTAAAG 1320
CAACCATTGG CTCTAGCCTC TGGAACCTTA GCACTCAAAG GAAATGTCGA GTTAGATGTC 1380
AATGGTTTCA CACAGACTGA AGGCTCTACA CTCCTCATGC AACCAGGAAC AAAGCTCAAA 1440
GCAGATACTG AAGCTATCAG TCTTACCAAA CTTGTCGTTG ATCTTTCTGC CTTAGAGGGA 1500
AATAAGAGTG TGTCCATTGA AACAGCAGGA GCCAACAAAA CTATAACTCT AACCTCTCCT 1560
CTTGTTTTCC AAGATAGTAG CGGCAATTTT TATGAAAGCC ATACGATAAA CCAAGCCTTC 1620
ACGCAGCCTT TGGTGGTATT CACTGCTGCT ACTGCTGCTA GCGATATTTA TATCGATGCG 1680
CTTCTCACTT CTCCAGTACA AACTCCAGAA CCTCATTACG GGTATCAGGG ACATTGGGAA 1740
GCCACTTGGG CAGACACATC AACTGCAAAA TCAGGAACTA TGACTTGGGT AACTACGGGC 1800
TACAACCCTA ATCCTGAGCG TAGAGCTTCC GTAGTTCCCG ATTCATTATG GGCATCCTTT 1860
ACTGACATTC GCACTCTACA GCAGATCATG ACATCTCAAG CGAATAGTAT CTATCAGCAA 1920
CGAGGACTCT GGGCATCAGG AACTGCGAAT TTCTTCCATA AGGATAAATC AGGAACTAAC 1980
CAAGCATTCC GACATAAAAG CTACGGCTAT ATTGTTGGAG GAAGTGCTGA AGATTTTTCT 2040
GAAAATATCT TCAGTGTAGC TTTCTGCCAG CTCTTCGGTA AAGATAAAGA CCTGTTTATA 2100
GTTGAAAATA CCTCTCATAA CTATTTAGCG TCGCTATACC TGCAACATCG AGCATTCCTA 2160
GGAGGACTTC CCATGCCCTC ATTTGGAAGT ATCACCGACA TGCTGAAAGA TATTCCTCTC 2220
ATTTTGAATG CCCAGCTAAG CTACAGCTAC ACTAAAAATG ATATGGATAC TCGCTATACT 2280
TCCTATCCTG AAGCTCAAGG TTCTTGGACC AATAATTCTG GGGCTCTAGA GCTCGGAGGA 2340
TCTCTGGCTC TATATCTCCC TAAAGAAGCA CCGTTCTTCC AGGGATATTT CCCCTTCTTA 2400
AACTTCCAGG CAGTCTACAG CCGCCAACAA AACTTTAAAG AGAGTGGCGC TGAAGCCCGT 2460
GCTTTTGATG ATGGAGACCT AGTGAACTGC TCTATCCCTG TCGGCATTCG GTTAGAAAAA 2520
ATCTC-GAAG ATGAAAAAAA TAATTTCGAG ATTTCTCTAG CCAACATTGG TGATGTGTAT 2580
tgtaaI^atc CCCGTTCGCG TACTTCTCTA ATGGTCAGTG GAGCCTCTTG GACTTCGCTA 2640
TC^^AAAACC TCGCACGACA AGCCTTCTTA GCAAGTGCTG GAAGCCATCT GACTCTCTCC 2700
CCTCATGTAG AACTCTCTGG GGAAGCTGCT TATGAGCTTC GTGGCTCAGC ACACATCTAC 2760
AATGTAGATT GTGGGCTAAG ATACTCATTC TAG 2793
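
The correspondence between a nucleic acid entry and the peptide entry that follows it can be checked by translating the DNA codon by codon. The following is a minimal sketch, assuming the standard genetic code and that the reading frame starts at the first base supplied; the function name `translate` and the table layout are illustrative and are not part of the listing. Applied to the final row of SEQ ID NO: 15 it gives the ten C-terminal residues printed for SEQ ID NO: 16.

```python
# Standard genetic code in the conventional T, C, A, G ordering
# (assumption: no alternative translation table applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter codes, as used in the listing.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA (spaces and case ignored) up to the first stop codon."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":  # stop codon ends the peptide
            break
        residues.append(THREE[amino])
    return residues


# Final row of SEQ ID NO: 15 (bases 2761 to 2793).
print(" ".join(translate("AATGTAGATT GTGGGCTAAG ATACTCATTC TAG")))
```

A codon containing an illegible character raises `KeyError`, which is a convenient way to locate damaged positions in a row.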

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 930 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Lys Ile Pro Leu His Lys Leu Leu Ile Ser Ser Thr Leu Val Thr
1 5 10 15

Pro Ile Leu Leu Ser Ile Ala Thr Tyr Gly Ala Asp Ala Ser Leu Ser
20 25 30

Pro Thr Asp Ser Phe Asp Gly Ala Gly Gly Ser Thr Phe Thr Pro Lys
35 40 45

Ser Thr Ala Asp Ala Asn Gly Thr Asn Tyr Val Leu Ser Gly Asn Val
50 55 60

Tyr Ile Asn Asp Ala Gly Lys Gly Thr Ala Leu Thr Gly Cys Cys Phe
65 70 75 80

Thr Glu Thr Thr Gly Asp Leu Thr Phe Thr Gly Lys Gly Tyr Ser Phe
85 90 95

Ser Phe Asn Thr Val Asp Ala Gly Ser Asn Ala Gly Ala Ala Ala Ser
100 105 110

Thr Thr Ala Asp Lys Ala Leu Thr Phe Thr Gly Phe Ser Asn Leu Ser
115 120 125

Phe Ile Ala Ala Pro Gly Thr Thr Val Ala Ser Gly Lys Ser Thr Leu
130 135 140

Ser Ser Ala Gly Ala Leu Asn Leu Thr Asp Asn Gly Thr Ile Leu Phe
145 150 155 160

Ser Gln Asn Val Ser Asn Glu Ala Asn Asn Asn Gly Gly Ala Ile Thr
165 170 175

Thr Lys Thr Leu Ser Ile Ser Gly Asn Thr Ser Ser Ile Thr Phe Thr
180 185 190

Ser Asn Ser Ala Lys Lys Leu Gly Gly Ala Ile Tyr Ser Ser Ala Ala
195 200 205

Ala Ser Ile Ser Gly Asn Thr Gly Gln Leu Val Phe Met Asn Asn Lys
210 215 220

Gly Glu Thr Gly Gly Gly Ala Leu Gly Phe Glu Ala Ser Ser Ser Ile
225 230 235 240

Thr Gln Asn Ser Ser Leu Phe Phe Ser Gly Asn Thr Ala Thr Asp Ala
245 250 255

Ala Gly Lys Gly Gly Ala Ile Tyr Cys Glu Lys Thr Gly Glu Thr Pro
260 265 270

Thr Leu Thr Ile Ser Gly Asn Lys Ser Leu Thr Phe Ala Glu Asn Ser
275 280 285

Ser Val Thr Gln Gly Gly Ala Ile Cys Ala His Gly Leu Asp Leu Ser
290 295 300

Ala Ala Gly Pro Thr Leu Phe Ser Asn Asn Arg Cys Gly Asn Thr Ala
305 310 315 320

Ala Gly Lys Gly Gly Ala Ile Ala Ile Ala Asp Ser Gly Ser Leu Ser
325 330 335

Leu Ser Ala Asn Gln Gly Asp Ile Thr Phe Leu Gly Asn Thr Leu Thr
340 345 350

Ser Thr Ser Ala Pro Thr Ser Thr Arg Asn Ala Ile Tyr Leu Gly Ser
355 360 365

Ser Ala Lys Ile Thr Asn Leu Arg Ala Ala Gln Gly Gln Ser Ile Tyr
370 375 380

Phe Tyr Asp Pro Ile Ala Ser Asn Thr Thr Gly Ala Ser Asp Val Leu
385 390 395 400

Thr Ile Asn Gln Pro Asp Ser Asn Ser Pro Leu Asp Tyr Ser Gly Thr
405 410 415

Ile Val Phe Ser Gly Glu Lys Leu Ser Ala Asp Glu Ala Lys Ala Ala
420 425 430

Asp Asn Phe Thr Ser Ile Leu Lys Gln Pro Leu Ala Leu Ala Ser Gly
435 440 445

Thr Leu Ala Leu Lys Gly Asn Val Glu Leu Asp Val Asn Gly Phe Thr
450 455 460

Gln Thr Glu Gly Ser Thr Leu Leu Met Gln Pro Gly Thr Lys Leu Lys
465 470 475 480

Ala Asp Thr Glu Ala Ile Ser Leu Thr Lys Leu Val Val Asp Leu Ser
485 490 495

Ala Leu Glu Gly Asn Lys Ser Val Ser Ile Glu Thr Ala Gly Ala Asn
500 505 510

Lys Thr Ile Thr Leu Thr Ser Pro Leu Val Phe Gln Asp Ser Ser Gly
515 520 525

Asn Phe Tyr Glu Ser His Thr Ile Asn Gln Ala Phe Thr Gln Pro Leu
530 535 540

Val Val Phe Thr Ala Ala Thr Ala Ala Ser Asp Ile Tyr Ile Asp Ala
545 550 555 560

Leu Leu Thr Ser Pro Val Gln Thr Pro Glu Pro His Tyr Gly Tyr Gln
565 570 575

Gly His Trp Glu Ala Thr Trp Ala Asp Thr Ser Thr Ala Lys Ser Gly
580 585 590

Thr Met Thr Trp Val Thr Thr Gly Tyr Asn Pro Asn Pro Glu Arg Arg
595 600 605

Ala Ser Val Val Pro Asp Ser Leu Trp Ala Ser Phe Thr Asp Ile Arg
610 615 620

Thr Leu Gln Gln Ile Met Thr Ser Gln Ala Asn Ser Ile Tyr Gln Gln
625 630 635 640

Arg Gly Leu Trp Ala Ser Gly Thr Ala Asn Phe Phe His Lys Asp Lys
645 650 655

Ser Gly Thr Asn Gln Ala Phe Arg His Lys Ser Tyr Gly Tyr Ile Val
660 665 670

Gly Gly Ser Ala Glu Asp Phe Ser Glu Asn Ile Phe Ser Val Ala Phe
675 680 685

Cys Gln Leu Phe Gly Lys Asp Lys Asp Leu Phe Ile Val Glu Asn Thr
690 695 700

Ser His Asn Tyr Leu Ala Ser Leu Tyr Leu Gln His Arg Ala Phe Leu
705 710 715 720

Gly Gly Leu Pro Met Pro Ser Phe Gly Ser Ile Thr Asp Met Leu Lys
725 730 735

Asp Ile Pro Leu Ile Leu Asn Ala Gln Leu Ser Tyr Ser Tyr Thr Lys
740 745 750

Asn Asp Met Asp Thr Arg Tyr Thr Ser Tyr Pro Glu Ala Gln Gly Ser
755 760 765

Trp Thr Asn Asn Ser Gly Ala Leu Glu Leu Gly Gly Ser Leu Ala Leu
770 775 780

Tyr Leu Pro Lys Glu Ala Pro Phe Phe Gln Gly Tyr Phe Pro Phe Leu
785 790 795 800

Lys Phe Gln Ala Val Tyr Ser Arg Gln Gln Asn Phe Lys Glu Ser Gly
805 810 815

Ala Glu Ala Arg Ala Phe Asp Asp Gly Asp Leu Val Asn Cys Ser Ile
820 825 830

Pro Val Gly Ile Arg Leu Glu Lys Ile Ser Glu Asp Glu Lys Asn Asn
835 840 845

Phe Glu Ile Ser Leu Ala Asn Ile Gly Asp Val Tyr Arg Lys Asn Pro
850 855 860

Arg Ser Arg Thr Ser Leu Met Val Ser Gly Ala Ser Trp Thr Ser Leu
865 870 875 880

Cys Lys Asn Leu Ala Arg Gln Ala Phe Leu Ala Ser Ala Gly Ser His
885 890 895

Leu Thr Leu Ser Pro His Val Glu Leu Ser Gly Glu Ala Ala Tyr Glu
900 905 910

Leu Arg Gly Ser Ala His Ile Tyr Asn Val Asp Cys Gly Leu Arg Tyr
915 920 925

Ser Phe
930



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GAAGACAATA TAAGGTACCG TCATAACAGC GGGGGTTATG CACTAGGGAT CACAGCAACA 60
ACTCCTGCCG AGGATCAGCT TACTTTTGCC TTCTGCCAGC TCTTTGCTAG AGATCGCAAT 120
CATATTACAG GTAAGAACCA CGGAGATACT TACGGTGCCT CTTTGTATTT CCACCATACA 180
GAAGGGCTCT TCGACATCGC CAATTTCCTC TGGGGAAAAG CAACCCGAGC TCCCTGGGTG 240
CTCTCTGAGA TCTCCCAGAT CATTCCTTTA TCGTTCGATG CTAAATTCAG TTATCTCCAT 300
ACAGACAACC ACATGAAGAC ATATTATACC GATAACTCTA TCATCAAGGG TTCTTGGAGA 360
AACGATGCCT TCTGTGCAGA TCTTGGAGCT AGCCTGCCTT TTGTTATTTC CGTTCCGTAT 420
CTTCTGAAAG AAGTCGAACC TTTTGTCAAA GTACAGTATA TCTATGCGCA TCAGCAAGAC 480
TTCTACGAGC GTCATGCTGA AGGACGCGCT TTCAATAAAA GCGAGCTTAT CAACGTAGAG 540
ATTCCTATAG GCGTCACCTT CGAAAGAGAC TCAAAATCAG AAAAGGGAAC TTACGATCTT 600
ACTCTTATGT ATATACTCGA TGCTTACCGA CGCAATCCTA AATGTCAAAC TTCCCTAATA 660
GCTAGCGATG CTAACTGGAT GGCCTATGGT ACCAACCTCG CACGACAAGG TTTTTCTGTT 720
CGTGCTGCGA ACCATTTCCA AGTGAACCCC CACATGGAAA TCTTCGGTCA ATTCGCTTTT 780
GAAGTACGAA GTTCTTCACG AAATTATAAT ACAAACCTAG GCTCTAAGTT TTGTTTCTAG 840
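
The stated lengths of paired entries can be cross-checked with simple arithmetic: a reading frame of N bases that ends in a stop codon encodes N/3 - 1 residues. The sketch below assumes the nucleic acid entry begins in frame and ends with its stop codon, as SEQ ID NO: 15 and SEQ ID NO: 17 do (both end in TAG); the function name `expected_residues` is illustrative only.

```python
def expected_residues(n_bases, ends_with_stop=True):
    """Number of residues encoded by an in-frame sequence of n_bases."""
    codons, remainder = divmod(n_bases, 3)
    if remainder:
        raise ValueError("length is not a whole number of codons")
    # The stop codon is counted as a codon but encodes no residue.
    return codons - 1 if ends_with_stop else codons


# SEQ ID NO: 17 (840 base pairs) against SEQ ID NO: 18 (279 amino acids)
print(expected_residues(840))
# SEQ ID NO: 15 (2793 base pairs) against SEQ ID NO: 16 (930 amino acids)
print(expected_residues(2793))
```

An entry whose length is not a multiple of three, such as a partial sequence, raises `ValueError` and has to be checked by translation instead.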

(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 279 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly Tyr Ala Leu Gly
1 5 10 15

Ile Thr Ala Thr Thr Pro Ala Glu Asp Gln Leu Thr Phe Ala Phe Cys
20 25 30

Gln Leu Phe Ala Arg Asp Arg Asn His Ile Thr Gly Lys Asn His Gly
35 40 45

Asp Thr Tyr Gly Ala Ser Leu Tyr Phe His His Thr Glu Gly Leu Phe
50 55 60

Asp Ile Ala Asn Phe Leu Trp Gly Lys Ala Thr Arg Ala Pro Trp Val
65 70 75 80

Leu Ser Glu Ile Ser Gln Ile Ile Pro Leu Ser Phe Asp Ala Lys Phe
85 90 95

Ser Tyr Leu His Thr Asp Asn His Met Lys Thr Tyr Tyr Thr Asp Asn
100 105 110

Ser Ile Ile Lys Gly Ser Trp Arg Asn Asp Ala Phe Cys Ala Asp Leu
115 120 125

Gly Ala Ser Leu Pro Phe Val Ile Ser Val Pro Tyr Leu Leu Lys Glu
130 135 140

Val Glu Pro Phe Val Lys Val Gln Tyr Ile Tyr Ala His Gln Gln Asp
145 150 155 160

Phe Tyr Glu Arg His Ala Glu Gly Arg Ala Phe Asn Lys Ser Glu Leu
165 170 175

Ile Asn Val Glu Ile Pro Ile Gly Val Thr Phe Glu Arg Asp Ser Lys
180 185 190

Ser Glu Lys Gly Thr Tyr Asp Leu Thr Leu Met Tyr Ile Leu Asp Ala
195 200 205

Tyr Arg Arg Asn Pro Lys Cys Gln Thr Ser Leu Ile Ala Ser Asp Ala
210 215 220

Asn Trp Met Ala Tyr Gly Thr Asn Leu Ala Arg Gln Gly Phe Ser Val
225 230 235 240

Arg Ala Ala Asn His Phe Gln Val Asn Pro His Met Glu Ile Phe Gly
245 250 255

Gln Phe Ala Phe Glu Val Arg Ser Ser Ser Arg Asn Tyr Asn Thr Asn
260 265 270

Leu Gly Ser Lys Phe Cys Phe
275



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1545 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATGACCATAC TTCGAAATTT TCTTACCTGC TCGGCTTTAT TCCTCGCTCT CCCTGCAGCA 60
GCACAAGTTG TATATCTTCA TGAAAGTGAT GGTTATAACG GTGCTATCAA TAATAAAAGC 120
TTAGAACCTA AAATTACCTG TTATCCAGAA GGAACTTCTT ACATCTTTCT AGATGACGTG 180
AGGATTTCCA ACGTTAAGCA TGATCAAGAA GATGCTGGGG TTTTTATAAA TCGATCTGGG 240
AATCTTTTTT TCATGGGCAA CCGTTGCAAC TTCACTTTTC ACAACCTTAT GACCGAGGGT 300
TTTGGCGCTG CCATTTCGAA CCGCGTTGGA GACACCACTC TCACTCTCTC TAATTTTTCT 360
TACTTAACGT TCACCTCAGC ACCTCTACTA CCTCAAGGAC AAGGAGCGAT TTATAGTCTT 420
GGTTCCGTGA TGATCGAAAA TAGTGAGGAA GTGACTTTCT GTGGGAACTA CTCTTCGTGG 480
AGTGGAGCTG CGATTTATAC TCCCTACCTT TTAGGTTCTA AGGCGAGTCG TCCTTCAGTA 540
AATCTCAGCG GGAACCGCTA CCTGGTGTTT AGAGACTATG TGAGCCAAGG TTATGGCGGC 600
GCCGTATCTA CCCACAATCT CACACTCACG ACTCGAGGAC CTTCGTGTTT TGAAAATAAT 660
CATGCTTATC ATGACGTGAA TAGTAATGGA GGAGCCATTG CCATTGCTCC TGGAGGATGG 720
ATCTCTATAT CCGTGAAAAG CGGAGATCTC ATCTTCAAAG GAAATACAGC ATCACAAGAC 780
GGAAATACAA TACACAACTC CATCCATCTG CAATCTGGAG CACAGTTTAA GAACCTACGT 840
GCTGTTTCAG AATCCGGAGT TTATTTCTAT GATGCTATAA GCCATAGCGA GTCGCATAAA 900
ATTACAGATC TTGTAATCAA TGCTCCTGAA GGAAAGGAAA CTTATGAAGG AACAATTAGC 960
TTCTCAGGAC TATGCCTGGA TGATCATGAA GTTTGTGCGG AAAATCTTAC TTCCACAATC 1020
CTACAAGATG TCACATTAGC AGGAGGAACT CTCTCTCTAT CGGATGGGGT TACCTTGCAA 1080
CTGCATTCTT TTAAGCAGGA AGCAAGCTCT ACGCTTACTA TGTCTCCAGG AACCACTCTG 1140
CTCTGCTCAG GAGATGCTCG GGTTGAGAAT CTGCACATCC TGATTGAAGA TACGGACAAC 1200
TTTGTTCCTG TAAGGATTCG CGCCGAGGAC AAGGATGCTC TTGTCTCATT AGAAAAACTT 1260
AAAGTTGCCT TTGAGGCTTA TTGGTCCGTC TATGACTTTC CTCAATTTAA GGAAGCCTTT 1320
ACGATTCCTC TTCTTGAACT TCTAGGGCCT TCTTTTGACA GTCTTCTCCT AGGGGAGACC 1380
ACTTTGGAGA GAACCCAAGT CACAACAGAG AATGACGCCG TTCGAGGTTT CTGGTCCCTA 1440
AGCTGGGAAG AGTACCCCCC TTCTCTGGAT AAAGACAGAA GGATCACACC AACTAAGAAA 1500
ACTGTTTTCC TCACTTGGAA TCCTGAGATC ACTTCTACGC CATAA 1545

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 514 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Thr Ile Leu Arg Asn Phe Leu Thr Cys Ser Ala Leu Phe Leu Ala
1 5 10 15

Leu Pro Ala Ala Ala Gln Val Val Tyr Leu His Glu Ser Asp Gly Tyr
20 25 30

Asn Gly Ala Ile Asn Asn Lys Ser Leu Glu Pro Lys Ile Thr Cys Tyr
35 40 45

Pro Glu Gly Thr Ser Tyr Ile Phe Leu Asp Asp Val Arg Ile Ser Asn
50 55 60

Val Lys His Asp Gln Glu Asp Ala Gly Val Phe Ile Asn Arg Ser Gly
65 70 75 80

Asn Leu Phe Phe Met Gly Asn Arg Cys Asn Phe Thr Phe His Asn Leu
85 90 95

Met Thr Glu Gly Phe Gly Ala Ala Ile Ser Asn Arg Val Gly Asp Thr
100 105 110

Thr Leu Thr Leu Ser Asn Phe Ser Tyr Leu Thr Phe Thr Ser Ala Pro
115 120 125

Leu Leu Pro Gln Gly Gln Gly Ala Ile Tyr Ser Leu Gly Ser Val Met
130 135 140

Ile Glu Asn Ser Glu Glu Val Thr Phe Cys Gly Asn Tyr Ser Ser Trp
145 150 155 160

Ser Gly Ala Ala Ile Tyr Thr Pro Tyr Leu Leu Gly Ser Lys Ala Ser
165 170 175

Arg Pro Ser Val Asn Leu Ser Gly Asn Arg Tyr Leu Val Phe Arg Asp
180 185 190

Tyr Val Ser Gln Gly Tyr Gly Gly Ala Val Ser Thr His Asn Leu Thr
195 200 205

Leu Thr Thr Arg Gly Pro Ser Cys Phe Glu Asn Asn His Ala Tyr His
210 215 220

Asp Val Asn Ser Asn Gly Gly Ala Ile Ala Ile Ala Pro Gly Gly Ser
225 230 235 240

Ile Ser Ile Ser Val Lys Ser Gly Asp Leu Ile Phe Lys Gly Asn Thr
245 250 255

Ala Ser Gln Asp Gly Asn Thr Ile His Asn Ser Ile His Leu Gln Ser
260 265 270

Gly Ala Gln Phe Lys Asn Leu Arg Ala Val Ser Glu Ser Gly Val Tyr
275 280 285

Phe Tyr Asp Pro Ile Ser His Ser Glu Ser His Lys Ile Thr Asp Leu
290 295 300

Val Ile Asn Ala Pro Glu Gly Lys Glu Thr Tyr Glu Gly Thr Ile Ser
305 310 315 320

Phe Ser Gly Leu Cys Leu Asp Asp His Glu Val Cys Ala Glu Asn Leu
325 330 335

Thr Ser Thr Ile Leu Gln Asp Val Thr Leu Ala Gly Gly Thr Leu Ser
340 345 350

Leu Ser Asp Gly Val Thr Leu Gln Leu His Ser Phe Lys Gln Glu Ala
355 360 365

Ser Ser Thr Leu Thr Met Ser Pro Gly Thr Thr Leu Leu Cys Ser Gly
370 375 380

Asp Ala Arg Val Gln Asn Leu His Ile Leu Ile Glu Asp Thr Asp Asn
385 390 395 400

Phe Val Pro Val Arg Ile Arg Ala Glu Asp Lys Asp Ala Leu Val Ser
405 410 415

Leu Glu Lys Leu Lys Val Ala Phe Glu Ala Tyr Trp Ser Val Tyr Asp
420 425 430

Phe Pro Gln Phe Lys Glu Ala Phe Thr Ile Pro Leu Leu Glu Leu Leu
435 440 445

Gly Pro Ser Phe Asp Ser Leu Leu Leu Gly Glu Thr Thr Leu Glu Arg
450 455 460

Thr Gln Val Thr Thr Glu Asn Asp Ala Val Arg Gly Phe Trp Ser Leu
465 470 475 480

Ser Trp Glu Glu Tyr Pro Pro Ser Leu Asp Lys Asp Arg Arg Ile Thr
485 490 495

Pro Thr Lys Lys Thr Val Phe Leu Thr Trp Asn Pro Glu Ile Thr Ser
500 505 510

Thr Pro



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATGAAAACGT CTATTCGTAA GTTCTTAATT TCTACCACAC TGGCGCCATG TTTTGCTTCA 60
Zl^^rl CTGTAGAAGT TATCATGCCT TCCGAGAACT TTGATGGATC GAGTGGGAAG 120
aSJScS? aScaacact TTCTGATCCT AGAGGGACAC TCTGTATTTT TTCAGGGGAT 180
SSIcaSg Saatcttga TAATGCCATA TCCAGAACCT CTTCCAGTTG CTTTAGCAAT 240
agggcgggIg Sctacaaat CTTAGGAAAA GGTGGGGTTT TCTCCTTCTT AAATATCCGT 300
?c??cAG?TG aSSgccgc GATTAGTAGT GTAATCACCC AAAATCCTGA ACTATGTCCC 360
SgISS? ?aggatttag TCAGATGATC TTCGATAACT GTGAATCTTT GACTTCAGAT 420
S^cIgCgI SStGTCAT ACCrCACGCA TCGGCGATTT ACGCTACAAC GCCCATGCTC 480
SIciS?A ATGACTCCAT ACTATTCCAA TACAACCGTT CTGCAGGATT TGGAGCTGCC 540
IttcgJ^S cSgcatcac AATAGAAAAT ACGAAAAAGA GCCTTCTCTT TAATGGTAAT 600
ggItSa?S c^^tcgagg GGCCCTCACG GGATCTGCAG CGATCAACCT CATCAACAAT 660
agSctSg tc^tttctc AACGAATGCT ACAGGGATCT ATGGTGGGGC TATTTACCTT 720
a?cggIgS? SSgctcac CTCTGGGAAC CTCTCAGGAG TCTTGTTCGT TTATAATAGC 780
TCGCGCT 787

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 262 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Lys Thr Ser Ile Arg Lys Phe Leu Ile Ser Thr Thr Leu Ala Pro
1 5 10 15

Cys Phe Ala Ser Thr Ala Phe Thr Val Glu Val Ile Met Pro Ser Glu
20 25 30

Asn Phe Asp Gly Ser Ser Gly Lys Ile Phe Pro Tyr Thr Thr Leu Ser
35 40 45

Asp Pro Arg Gly Thr Leu Cys Ile Phe Ser Gly Asp Leu Tyr Ile Ala
50 55 60

Asn Leu Asp Asn Ala Ile Ser Arg Thr Ser Ser Ser Cys Phe Ser Asn
65 70 75 80

Arg Ala Gly Ala Leu Gln Ile Leu Gly Lys Gly Gly Val Phe Ser Phe
85 90 95

Leu Asn Ile Arg Ser Ser Ala Asp Gly Ala Ala Ile Ser Ser Val Ile
100 105 110

Thr Gln Asn Pro Glu Leu Cys Pro Leu Ser Phe Ser Gly Phe Ser Gln
115 120 125

Met Ile Phe Asp Asn Cys Glu Ser Leu Thr Ser Asp Thr Ser Ala Ser
130 135 140

Asn Val Ile Pro His Ala Ser Ala Ile Tyr Ala Thr Thr Pro Met Leu
145 150 155 160

Phe Thr Asn Asn Asp Ser Ile Leu Phe Gln Tyr Asn Arg Ser Ala Gly
165 170 175

Phe Gly Ala Ala Ile Arg Gly Thr Ser Ile Thr Ile Glu Asn Thr Lys
180 185 190

Lys Ser Leu Leu Phe Asn Gly Asn Gly Ser Ile Ser Asn Gly Gly Ala
195 200 205

Leu Thr Gly Ser Ala Ala Ile Asn Leu Ile Asn Asn Ser Ala Pro Val
210 215 220

Ile Phe Ser Thr Asn Ala Thr Gly Ile Tyr Gly Gly Ala Ile Tyr Leu
225 230 235 240

Thr Gly Gly Ser Met Leu Thr Ser Gly Asn Leu Ser Gly Val Leu Phe
245 250 255

Val Tyr Asn Ser Ser Arg
260

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2838 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATGAAGACTT CAGTTTCTAT GTTGTTGGCC CTGCTTTGCT CGGGGGCTAG CTCTATTGTA 60
CTCCATGCCG CAACCACTCC ACTAAATCCT GAAGATGGGT TTATTGGGGA GGGCAATACA 120
AATACTTTTT CTCCGAAATC TACAACGGAT GCTGCAGGAA CTACCTACTC TCTCACAGGA 180
GAGGTTCTGT TTATAGATCC GGGGAAAGGT GGTTCAATTA CAGGAACTTG CTTTGTAGAA 240
ACTGCTGGCG ATCTTACATT TTTAGGTAAT GGAAATACCC TAAAGTTCCT GTCGGTAGAT 300
GCAGGTGCTA ATATCGCGGT TGCTCATGTA CAAGGAAGTA AGAATTTAAG CTTCACAGAT 360
TTCCTTTCTC TGGTGATCAC AGAATCTCCA AAATCCGCTG TTAGTACAGG AAAAGGTAGC 420
CTAGTCAGTT CAGGTGCAGT CCAACTGCAA GATATAAACA CTCTAGTTCT TACAAGCAAT 480
GCCTCTGTCG AAGATGGTGG CGTGATTAAA GGAAACTCCT GCTTGATTCA GGGAATCAAA 540
AATAGTGCGA TTTTTGGACA AAATACATCT TCGAAAAAAG GAGGGGCGAT CTCCACGACT 600
CAAGGACTCA CCATAGAGAA TAACTTAGGG ACGCTAAAGT TCAATGAAAA CAAAGCAGTG 660
ACCTCAGGAG GCGCCTTAGA TTTAGGAGCC GCGTCTACAT TCACTGCGAA CCATGAGTTG 720
ATATTTTCAC AAAATAAGAC TTCTGGGAAT GCTGCAAATG GCGGAGCCAT AAATTGCTCA 780
GGCGACCTAA CATTTACTGA TAACACTTCT TTGTTACTTC AAGAAAATAG CACAATGCAG 840
GATGGTGGAG CTTTGTGTAG CACAGGAACC ATAAGCATTA CCGGTAGTGA TTCTATCAAT 900
GTGATAGGAA ATACTTCAGG ACAAAAAGGA GGAGCGATTT CTGCAGCTTC TCTCAAGATT 960
TTGGGAGGGC AGGGAGGCGC TCTCTTTTCT AATAACGTAG TGACTCATGC CACCCCTCTA 1020
GGAGGTGCCA TTTTTATCAA CACAGGAGGA TCCTTGCAGC TCTTCACTCA AGGAGGGGAT 1080
ATCGTATTCG AGGGGAATCA GGTCACTACA ACAGCTCCAA ATGCTACCAC TAAGAGAAAT 1140
GTAATTCACC TCGAGAGCAC CGCGAAGTGG ACGGGACTTG CTGCAAGTCA AGGTAACGCT 1200
ATCTATTTCT ATGATCCCAT TACCACCAAC GATACGGGAG CAAGCGATAA CTTACGTATC 1260
AATGAGGTCA GTGCAAATCA AAAGCTCTCG GGATCTATAG TATTTTCTGG AGAGAGATTG 1320
TCGACAGCAG AAGCTATAGC TGAAAATCTT ACTTCGAGGA TCAACCAGCC TGTCACTTTA 1380
GTAGAGGGGA GCTTAGAACT TAAACAGGGA GTGACCTTGA TCACACAAGG ATTCTCGCAG 1440
GAGCCAGAAT CCACGCTTCT TTTGGATTTG GGGACCTCAT TACAAGCTTC TACAGAAGAT 1500
ATCGTCATCA CAAATTCATC TATAAATGCC GATACCATTT ACGGAAAGAA TCCAATCAAT 1560
ATTGTAGCTT CAGCAGCGAA TAAGAACATT ACCCTAACAG GAACCTTAGC ACTTGTAAAT 1620
GCAGATGGAG CTTTGTATGA GAACCATACC TTGCAAGACT CTCAAGATTA TAGCTTTGTA 1680
AAGTTATCTC CAGGAGCGGG AGGGACTATA ATTACTCAAG ATGCTTCTCA GAAGCTTCTT 1740
GAAGTAGCTC CTTCTAGACC ACATTATGGC TATCAAGGAC ATTGGAATGT GCAAGTCATC 1800
CCAGGAACGG GAACTCAACC GAGCCAGGCA AATTTAGAAT GGGTGCGGAC AGGATACCTT 1860
CCGAATCCCG AACGGCAAGG ATTTTTAGTT CCCAATAGCC TGTGGGGTTC TTTTGTTGAT 1920
CAGCGTGCTA TCCAAGAAAT CATGGTAAAT AGTAGCCAAA TCTTATGTCA GGAACGGGGA 1980
GTCTGGGGAG CTGGAATTGC TAATTTCCTA CATAGAGATA AAATTAATGA GCACGGCTAT 2040
CGCCATAGCG GTGTCGGTTA TCTTGTGGGA GTTGGCACTC ATGCTTTTTC TGATGCTACG 2100
ATAAATGCGG CTTTTTGCCA GCTCTTCAGT AGAGATAAAG ACTACGTAGT ATCCAAAAAT 2160
CATGGAACTA GCTACTCAGG GGTCGTATTT CTTGAGGATA CCCTAGAGTT TAGAAGTCCA 2220
CAGGGATTCT ATACTGATAG CTCCTCAGAA GCTTGCTGTA ACCAAGTCGT CACTATAGAT 2280
ATGCAGTTGT CTTACAGCCA TAGAAATAAT GATATGAAAA CCAAATACAC GACATATCCA 2340
GAAGCTCAGG GATCTTGGGC AAATGATGTT TTTGGTCTTG AGTTTGGAGC GACTACATAC 2400
TACTACCCTA ACAGTACTTT TTTATTTGAT TACTACTCTC CGTTTCTCAG GCTGCAGTGC 2460
ACCTATGCTC ACCAGGAAGA CTTCAAAGAG ACAGGAGGTG AGGTTCGTCA CTTTACTAGC 2520
GGAGATCTTT TCAATTTAGC AGTTCCTATT GGCGTGAAGT TTGAGAGATT TTCAGACTGT 2580
AAAAGGGGAT CTTATGAACT TACCCTTGCT TATGTTCCTG ATGTGATTCG CAAAGATCCC 2640
AAGAGCACGG CAACATTGGC TAGTGGAGCT ACGTGGAGCA CCCACGGAAA CAATCTCTCC 2700
AGACAAGGAT TACAACTGCG TTTAGGGAAC CACTGTCTCA TAAATCCTGG AATTGAGGTG 2760
TTCAGTCACG GAGCTATTGA ATTGCGGGGA TCCTCTCGTA ATTATAACAT CAATCTCGGG 2820
GGTAAATACC GATTTTAA 2838

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 946 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Lys Thr Ser Val Ser Met Leu Leu Ala Leu Leu Cys Ser Gly Ala
1 5 10 15

Ser Ser Ile Val Leu His Ala Ala Thr Thr Pro Leu Asn Pro Glu Asp
20 25 30

Gly Phe Ile Gly Glu Gly Asn Thr Asn Thr Phe Ser Pro Lys Ser Thr
35 40 45

Thr Asp Ala Ala Gly Thr Thr Tyr Ser Leu Thr Gly Glu Val Leu Phe
50 55 60

Ile Asp Pro Gly Lys Gly Gly Ser Ile Thr Gly Thr Cys Phe Val Glu
65 70 75 80

Thr Ala Gly Asp Leu Thr Phe Leu Gly Asn Gly Asn Thr Leu Lys Phe
85 90 95

Leu Ser Val Asp Ala Gly Ala Asn Ile Ala Val Ala His Val Gln Gly
100 105 110

Ser Lys Asn Leu Ser Phe Thr Asp Phe Leu Ser Leu Val Ile Thr Glu
115 120 125

Ser Pro Lys Ser Ala Val Ser Thr Gly Lys Gly Ser Leu Val Ser Ser
130 135 140

Gly Ala Val Gln Leu Gln Asp Ile Asn Thr Leu Val Leu Thr Ser Asn
145 150 155 160

Ala Ser Val Glu Asp Gly Gly Val Ile Lys Gly Asn Ser Cys Leu Ile
165 170 175

Gln Gly Ile Lys Asn Ser Ala Ile Phe Gly Gln Asn Thr Ser Ser Lys
180 185 190

Lys Gly Gly Ala Ile Ser Thr Thr Gln Gly Leu Thr Ile Glu Asn Asn
195 200 205

Leu Gly Thr Leu Lys Phe Asn Glu Asn Lys Ala Val Thr Ser Gly Gly
210 215 220

Ala Leu Asp Leu Gly Ala Ala Ser Thr Phe Thr Ala Asn His Glu Leu
225 230 235 240

Ile Phe Ser Gln Asn Lys Thr Ser Gly Asn Ala Ala Asn Gly Gly Ala
245 250 255



WO 98/58953

PCT/DK98/00266 

72 

Ile Asn Cys Ser Gly Asp Leu Thr Phe Thr Asp Asn Thr Ser Leu Leu
260 265 270

Leu Gln Glu Asn Ser Thr Met Gln Asp Gly Gly Ala Leu Cys Ser Thr
275 280 285

Gly Thr Ile Ser Ile Thr Gly Ser Asp Ser Ile Asn Val Ile Gly Asn
290 295 300

Thr Ser Gly Gln Lys Gly Gly Ala Ile Ser Ala Ala Ser Leu Lys Ile
305 310 315 320

Leu Gly Gly Gln Gly Gly Ala Leu Phe Ser Asn Asn Val Val Thr His
325 330 335

Ala Thr Pro Leu Gly Gly Ala Ile Phe Ile Asn Thr Gly Gly Ser Leu
340 345 350

Gln Leu Phe Thr Gln Gly Gly Asp Ile Val Phe Glu Gly Asn Gln Val
355 360 365

Thr Thr Thr Ala Pro Asn Ala Thr Thr Lys Arg Asn Val Ile His Leu
370 375 380

Glu Ser Thr Ala Lys Trp Thr Gly Leu Ala Ala Ser Gln Gly Asn Ala
385 390 395 400

Ile Tyr Phe Tyr Asp Pro Ile Thr Thr Asn Asp Thr Gly Ala Ser Asp
405 410 415

Asn Leu Arg Ile Asn Glu Val Ser Ala Asn Gln Lys Leu Ser Gly Ser
420 425 430

Ile Val Phe Ser Gly Glu Arg Leu Ser Thr Ala Glu Ala Ile Ala Glu
435 440 445

Asn Leu Thr Ser Arg Ile Asn Gln Pro Val Thr Leu Val Glu Gly Ser
450 455 460

Leu Glu Leu Lys Gln Gly Val Thr Leu Ile Thr Gln Gly Phe Ser Gln
465 470 475 480

Glu Pro Glu Ser Thr Leu Leu Leu Asp Leu Gly Thr Ser Leu Gln Ala
485 490 495

Ser Thr Glu Asp Ile Val Ile Thr Asn Ser Ser Ile Asn Ala Asp Thr
500 505 510

Ile Tyr Gly Lys Asn Pro Ile Asn Ile Val Ala Ser Ala Ala Asn Lys
515 520 525

Asn Ile Thr Leu Thr Gly Thr Leu Ala Leu Val Asn Ala Asp Gly Ala
530 535 540

Leu Tyr Glu Asn His Thr Leu Gln Asp Ser Gln Asp Tyr Ser Phe Val
545 550 555 560

Lys Leu Ser Pro Gly Ala Gly Gly Thr Ile Ile Thr Gln Asp Ala Ser
565 570 575

Gln Lys Leu Leu Glu Val Ala Pro Ser Arg Pro His Tyr Gly Tyr Gln
580 585 590

Gly His Trp Asn Val Gln Val Ile Pro Gly Thr Gly Thr Gln Pro Ser
595 600 605

Gln Ala Asn Leu Glu Trp Val Arg Thr Gly Tyr Leu Pro Asn Pro Glu
610 615 620

Arg Gln Gly Phe Leu Val Pro Asn Ser Leu Trp Gly Ser Phe Val Asp
625 630 635 640

Gln Arg Ala Ile Gln Glu Ile Met Val Asn Ser Ser Gln Ile Leu Cys
645 650 655

Gln Glu Arg Gly Val Trp Gly Ala Gly Ile Ala Asn Phe Leu His Arg
660 665 670

Asp Lys Ile Asn Glu His Gly Tyr Arg His Ser Gly Val Gly Tyr Leu
675 680 685

Val Gly Val Gly Thr His Ala Phe Ser Asp Ala Thr Ile Asn Ala Ala
690 695 700

Phe Cys Gln Leu Phe Ser Arg Asp Lys Asp Tyr Val Val Ser Lys Asn
705 710 715 720

His Gly Thr Ser Tyr Ser Gly Val Val Phe Leu Glu Asp Thr Leu Glu
725 730 735

Phe Arg Ser Pro Gln Gly Phe Tyr Thr Asp Ser Ser Ser Glu Ala Cys
740 745 750

Cys Asn Gln Val Val Thr Ile Asp Met Gln Leu Ser Tyr Ser His Arg
755 760 765

Asn Asn Asp Met Lys Thr Lys Tyr Thr Thr Tyr Pro Glu Ala Gln Gly
770 775 780

Ser Trp Ala Asn Asp Val Phe Gly Leu Glu Phe Gly Ala Thr Thr Tyr
785 790 795 800

Tyr Tyr Pro Asn Ser Thr Phe Leu Phe Asp Tyr Tyr Ser Pro Phe Leu
805 810 815

Arg Leu Gln Cys Thr Tyr Ala His Gln Glu Asp Phe Lys Glu Thr Gly
820 825 830

Gly Glu Val Arg His Phe Thr Ser Gly Asp Leu Phe Asn Leu Ala Val
835 840 845

Pro Ile Gly Val Lys Phe Glu Arg Phe Ser Asp Cys Lys Arg Gly Ser
850 855 860

Tyr Glu Leu Thr Leu Ala Tyr Val Pro Asp Val Ile Arg Lys Asp Pro
865 870 875 880

Lys Ser Thr Ala Thr Leu Ala Ser Gly Ala Thr Trp Ser Thr His Gly
885 890 895

Asn Asn Leu Ser Arg Gln Gly Leu Gln Leu Arg Leu Gly Asn His Cys
900 905 910

Leu Ile Asn Pro Gly Ile Glu Val Phe Ser His Gly Ala Ile Glu Leu
915 920 925

Arg Gly Ser Ser Arg Asn Tyr Asn Ile Asn Leu Gly Gly Lys Tyr Arg
930 935 940

Phe
945



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3000 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence 

(B) LOCATION: 259...3000
(D) OTHER INFORMATION:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATCAGGTGAT AAAAGTTCCT CGTTAGCTAG TGACTGTAGG TGACATGAGA AAGCTAACAC 60
GGAGGAAACT AAAACCCAAG GAATCGAAGT CTTCATGGTA ATGCTTTTGT TTTTTAGAGA 120
ACTATTCGCA TCAATATAGA AACAAAATAA GTAAATCAAG TTAAAGATGA CAAAACAGCT 180
GTCAAGAATT TTTATCTTGA CTCTCTGAGT TTTCTATTTT ATATGACGCA AGTAAGAATT 240
TAATAATAAA GTGGGTTT ATG AAA TCG CAA TTT TCC TGG TTA GTG CTC TCT 291

Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser
1 5 10
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I- s ^ ^: s ?s 

^" «i s s I- 5 z ^ I z ?s 

s ?F - ^" - ?s ^i: ;i ™ 

ACA GGA GAT ATA ACT CTG CAA AAP n-TT r-r-r- r^nr^ 

™. ™. . .T. Z Z Z IT. S: SI s ?s 

" 70 ^5 

AAG GGT TGT TTT TCT GAG ACT ACG GAA TCT rra ^nr^ 

.v= c,, ..e s„ ..p ... ™. z s 

^5 90 

100 y 

GCA CTT TCT GTT ACA ACT GAT AAA AAT CTG TCC r-v^ .n. 
Ala Leu Ser Val Thr Thr Asd r 72 7 ™ '^^'^ '^'^ TCG 

Thr Thr Asp Lys Asn Leu Ser Leu Thr Gly Phe Ser 

-^.-^---^^^^^^^ 

^•^^ 135 

=s ^ IT, ^ Ty r r -^"^ «^ 

140 '^SP Leu Thr Phe Asp Asn Asn 

150 

GGA ACT ATT TTA TTT AAA CAA GAT TAC TCT r^n n^. „ 

0.y X.. „e P.e oi„ ..p Z Ty IT. 

170 

ATT TCT ACC AAG AAT CTT TPT T-rr- a a n 

"e s„ ... I- 1- - - .oc 

185 

^ c?? z IT tT IT. r.". iTyrr ^ -^^ 

190 Gly Gly Ala He 

S - 5- - Z Z iT Z iT 
i IT Z Z Z Z Z Z ^y TTy Z S Z TT ?S 

230 23 



339 



387 



435 



483 



531 



579 



627 



675 



723 



771 



819 



867 



915 



963 



1011 
Gly Asn Cys Thr Ile Thr Gly Asn Thr Ser Leu Val Phe Ser Glu Asn
240 245 250

AGT GTG ACA GCG ACC GCA GGA AAT GGA GGA GCT CTT TCT GGA GAT GCC 1059
Ser Val Thr Ala Thr Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala
255 260 265

GAT GTT ACC ATA TCT GGG AAT CAG AGT GTA ACT TTC TCA GGA AAC CAA 1107
Asp Val Thr Ile Ser Gly Asn Gln Ser Val Thr Phe Ser Gly Asn Gln
270 275 280

GCT GTA GCT AAT GGC GGA GCC ATT TAT GCT AAG AAG CTT ACA CTG GCT 1155
Ala Val Ala Asn Gly Gly Ala Ile Tyr Ala Lys Lys Leu Thr Leu Ala
285 290 295

TCC GGG GGG GGG GGG GGT ATC TCC TTT TCT AAC AAT ATA GTC CAA GGT 1203
Ser Gly Gly Gly Gly Gly Ile Ser Phe Ser Asn Asn Ile Val Gln Gly
300 305 310 315

ACC ACT GCA GGT AAT GGT GGA GCC ATT TCT ATA CTG GCA GCT GGA GAG 1251
Thr Thr Ala Gly Asn Gly Gly Ala Ile Ser Ile Leu Ala Ala Gly Glu
320 325 330

TGT AGT CTT TCA GCA GAA GCA GGG GAC ATT ACC TTC AAT GGG AAT GCC 1299
Cys Ser Leu Ser Ala Glu Ala Gly Asp Ile Thr Phe Asn Gly Asn Ala
335 340 345

ATT GTT GCA ACT ACA CCA CAA ACT ACA AAA AGA AAT TCT ATT GAC ATA 1347
Ile Val Ala Thr Thr Pro Gln Thr Thr Lys Arg Asn Ser Ile Asp Ile
350 355 360

GGA TCT ACT GCA AAG ATC ACG AAT TTA CGT GCA ATA TCT GGG CAT AGC 1395
Gly Ser Thr Ala Lys Ile Thr Asn Leu Arg Ala Ile Ser Gly His Ser
365 370 375

ATC TTT TTC TAC GAT CCG ATT ACT GCT AAT ACG GCT GCG GAT TCT ACA 1443
Ile Phe Phe Tyr Asp Pro Ile Thr Ala Asn Thr Ala Ala Asp Ser Thr
380 385 390 395

GAT ACT TTA AAT CTC AAT AAG GCT GAT GCA GGT AAT AGT ACA GAT TAT 1491
Asp Thr Leu Asn Leu Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr
400 405 410

AGT GGG TCG ATT GTT TTT TCT GGT GAA AAG CTC TCT GAA GAT GAA GCA 1539
Ser Gly Ser Ile Val Phe Ser Gly Glu Lys Leu Ser Glu Asp Glu Ala
415 420 425

AAA GTT GCA GAC AAC CTC ACT TCT ACG CTG AAG CAG CCT GTA ACT CTA 1587
Lys Val Ala Asp Asn Leu Thr Ser Thr Leu Lys Gln Pro Val Thr Leu
430 435 440

ACT GCA GGA AAT TTA GTA CTT AAA CGT GGT GTC ACT CTC GAT ACG AAA 1635
Thr Ala Gly Asn Leu Val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys
445 450 455

TT ' l - kCl - LAb ACC GCG OOT TCC TCT GTT ATT T^TG GAT GPfi GfiC ACA 1683
Gly Phe Thr Gln Thr Ala Gly Ser Ser Val Ile Met Asp Ala Gly Thr






460 465 470 475



ACG TTA AAA GCA AGT ACA GAG GAG GTC ACT TTA ACA GGT CTT TCC ATT
Thr Leu Lys Ala Ser Thr Glu Glu Val Thr Leu Thr Gly Leu Ser Ile
480 485

CCT GTA GAC TCT TTA GGC GAG GGT AAG AAA GTT GTA ATT GCT GCT TCT
Pro Val Asp Ser Leu Gly Glu Gly Lys Lys Val Val Ile Ala Ala Ser

500 



520 



AAC CAA GGG AAT GCT TAT GAA AAT CAC GAC TTA GGA AAA ACT CAA GAC
Asn Gln Gly Asn Ala Tyr Glu Asn His Asp Leu Gly Lys Thr Gln Asp

b30 535 



S Zt I- tz Z ?S ?S ?S z 



"0 555 



GTT CCA GCG GTT CCT ACA GTA GCA ACT CCT ACG CAC TAT GGG TAT CAA
Val Pro Ala Val Pro Thr Val Ala Thr Pro Thr His Tyr Gly Tyr Gln



570 



til 1""° rT ^'"'^ '''''' ""^^ ^'^^ AGC.ACT CCA AAG 

Gly Thr Trp Gly Met Thr Trp Val Asp Asp Thr Ala Ser Thr Pro 



580 



585 



f Jk'' '''''' """^ "^^^ GGC TAC CTT CCG AAT 

Thr Lys Thr Ala Thr Leu Ala Trp Thr Asn Thr Gly Tyr Le^ Pro 



600 



CCT GAG CGT CAA GGA CCT TTA GTT CCT AAT AGC CTT TGG GGA TCT TTT
Pro Glu Arg Gln Gly Pro Leu Val Pro Asn Ser Leu Trp Gly Ser Phe



o ^ b 



630 



635 



CTT TGT TCA GAT CGA GGC TTC TGG GCT GCG GGA GTC GCC AAT TTC TTA
Leu Cys Ser Asp Arg Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu



e^JS 650 



GAT AAA GAT AAG AAA GGG GAA AAA CGC AAA TAC CGT CAT AAA TCT GGT
Asp Lys Asp Lys Lys Gly Glu Lys Arg Lys Tyr Arg His Lys Ser Gly



"0 665 



GGA TAT GCT ATC GGA GGT GCA GCG CAA ACT TGT TCT GAA AAC TTA ATT
Gly Tyr Ala Ile Gly Gly Ala Ala Gln Thr Cys Ser Glu Asn Leu Ile



^'^^ 680 



ser 7Z T ir ^-^^ ^ GAT TTC TTA GTC 

ser Phe Ala Phe Cys. Gin Leu Phe Gly Ser Asp Lys Asp Phe Zu til 



1731 



1779 



GCA GCA AGT AAA AAT GTA GCC CTT AGT GGT CCG ATT CTT CTT T^r n.-r 
Ala Ala Ser Lys Asn Val Ala Leu Ser Gly Pro ill IZ Zl Zl Z 



1875 



1923 



1971 



2019 



2067 



2115 



2163 



2211 



2259 



2307 



2355 



695 
GCT AAA AAT CAT ACT GAT ACC TAT GCA GGA GCC TTC TAT ATC CAA CAC 2403
Ala Lys Asn His Thr Asp Thr Tyr Ala Gly Ala Phe Tyr Ile Gln His
700 705 710 715

ATT ACA GAA TGT AGT GGG TTC ATA GGT TGT CTC TTA GAT AAA CTT CCT 2451
Ile Thr Glu Cys Ser Gly Phe Ile Gly Cys Leu Leu Asp Lys Leu Pro
720 725 730

GGC TCT TGG AGT CAT AAA CCC CTC GTT TTA GAA GGG CAG CTC GCT TAT 2499
Gly Ser Trp Ser His Lys Pro Leu Val Leu Glu Gly Gln Leu Ala Tyr
735 740 745

AGC CAC GTC AGT AAT GAT CTG AAG ACA AAG TAT ACT GCG TAT CCT GAG 2547
Ser His Val Ser Asn Asp Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu
750 755 760

GTG AAA GGT TCT TGG GGG AAT AAT GCT TTT AAC ATG ATG TTG GGA GCT 2595
Val Lys Gly Ser Trp Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala
765 770 775

TCT TCT CAT TCT TAT CCT GAA TAC CTG CAT TGT TTT GAT ACC TAT GCT 2643
Ser Ser His Ser Tyr Pro Glu Tyr Leu His Cys Phe Asp Thr Tyr Ala
780 785 790 795

CCA TAC ATC AAA CTG AAT CTG ACC TAT ATA CGT CAG GAC AGC TTC TCG 2691
Pro Tyr Ile Lys Leu Asn Leu Thr Tyr Ile Arg Gln Asp Ser Phe Ser
800 805 810

GAG AAA GGT ACA GAA GGA AGA TCT TTT GAT GAC AGC AAC CTC TTC AAT 2739
Glu Lys Gly Thr Glu Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn
815 820 825

TTA TCT TTG CCT ATA GGG GTG AAG TTT GAG AAG TTC TCT GAT TGT AAT 2787
Leu Ser Leu Pro Ile Gly Val Lys Phe Glu Lys Phe Ser Asp Cys Asn
830 835 840

GAC TTT TCT TAT GAT CTG ACT TTA TCC TAT GTT CCT GAT CTT ATC CGC 2835
Asp Phe Ser Tyr Asp Leu Thr Leu Ser Tyr Val Pro Asp Leu Ile Arg
845 850 855

AAT GAT CCC AAA TGC ACT ACA GCA CTT GTA ATC AGC GGA GCC TCT TGG 2883
Asn Asp Pro Lys Cys Thr Thr Ala Leu Val Ile Ser Gly Ala Ser Trp
860 865 870 875

GAA ACT TAT GCC AAT AAC TTA GCA CGA CAG GCC TTG CAA GTG CGT GCA 2931
Glu Thr Tyr Ala Asn Asn Leu Ala Arg Gln Ala Leu Gln Val Arg Ala
880 885 890

GGC AGT CAC TAC GCC TTC TCT CCT ATG TTT GAA GTG CTC GGC CAG TTT 2979
Gly Ser His Tyr Ala Phe Ser Pro Met Phe Glu Val Leu Gly Gln Phe
895 900 905

GTC TTT GAA GTT CGT GGA TCC 3000
Val Phe Glu Val Arg Gly Ser
910
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 914 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser Ser Thr Leu Ala Cys
Phe Thr Ser Cys Ser Thr Val Phe Ala Thr Ala Glu Asn Ile Gly

Pro Ser Asp Ser Phe Asp Gly Ser Thr Asn Thr Gly Thr Tyr Thr Pro
Lys Asn Thr Thr Thr Gly Ile Asp Tyr Thr Leu Thr Gly Asp Ile Thr
Leu Gln Asn Leu Gly Asp Ser Ala Ala Leu Thr Lys Gly Cys Phe Ser
ASP Thr Thr Glu Ser Leu Ser Phe Ala Gly lys Gly Tyr Ser Leu 11 
Phe Leu Asn Xle Lys Ser Ser Ala Glu G^y Ala Ala Leu Ser Val Thr 
Thr ASP Lys Asn Leu Ser Leu Thr oH Phe Ser Ser Leu T^r Phe Leu 
Ala Ala Pro Ser Ser Val lie Thr Thr Pro Ser Gly ^ys Gly Ala Val 
Lys cys Gly Gly Asp Leu Thr Phe Asp Asn Asn Oil Thr He Leu Phe 
Lys Gin ASP Tyr Cys Glu Glu Asn Gly Gly HI He Ser Thr Lys HI 
Leu Ser Leu Lys Asn Ser Thr Gly Ser III Ser Phe Glu Gly Asn Lys 
ser ser Ala Thr Gly Lys Lys Gly Gly Ala He Cys Ala ^hr Gly Thr 
val ASP He Thr Asn Asn Thr HI Pro Thr Leu Phe Ser Asn Asn He 
Ala Glu Ala Ala Gly Gly HI He Asn Ser Thr G^y Asn Cys Thr He 
Thr Gly Asn Thr Ser Leu Val Phe Ser Glu HI Ser Val Thr Ala ^hr 
Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala Asp Val Thr h" Ser 
Gly Asn Gin Ser Val Thr Phe Ser GlJ Asn Gin Ala Val HI Asn Gly 

Oly Ala He .Vr Ala Lys Lys l" Thr Leu Ala Ser GlJ Gly Gly Gly 

300 

Gly He ser Phe Ser Asn Asn He Val Gin Gly Thr Thr Ala Gly Asn 
Gly Gly Ala He Ser He Leu Ala Ala Gly HI Cys Ser Leu Ser A^a 
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330 



Glu Ala Gly Asp He Thr Phe Asn Gly Asn Ala He Val Ala Thr Thr 



345 



350 
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pro Gin Thr Thr Lys Arg Asn Ser He Asp He Gly Ser Thr Ala Lys 

Job 



lie Thr III Leu Arg Ala He Ser Gly His Ser He Phe Phe Tyr Asp 



360 



380 

370 



Pro lie Thr Ala Asn Thr Ala Ala Asp Ser Thr Asp Thr Leu Asn Leu 
Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr Ser Gly Ser He Val 



din 415 

405 



Phe ser Gly Glu Lys Leu Ser Glu Asp Glu Ala Lys Val Ala Asp Asn 
Leu Thr ser Thr Leu Lys Gin Pro III Thr Leu Thr Ala Gly Asn Leu 
val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys Gly Phe Thr Gin Thr 
Ala Gly ser Ser Val He Met Asp Ala Gly Thr Thr Leu Lys Ala Ser 



Thr Glu Glu val Thr leu Thr Gly Leu Ser He Pro Val Asp Ser Leu 



470 



485 



490 



495 



Gly Glu Gly Lys Lys Val Val He Ala Ala Ser Ala Ala Ser Lys Asn 

500 505 
val Ala Leu Ser Gly Pro He Leu Leu Leu Asp Asn Gin Gly Asn Ala 

52 0 52 5 

Tyr Glu III His ASP Leu Gly Lys Thr Gin Asp Phe Ser Phe Val Gin 

l.eu III Ala Leu Gly Thr "la Thr Thr Thr Asp vll Pro Ala Val Pro 



550 



Thr val Ala Thr Pro Th^ His Tyr Gly Tyr Gin Gly Thr Trp Gly Met 

565 570 
Thr Trp val Asp Asp Thr Ala Ser Thr Pro Lys Thr Lys Thr Ala Thr 

Q 585 ->yu 

Leu Ala Trp ^hr Asn Thr Gly Tyr Leu Pro Asn Pro Glu Arg Gin Gly 

S9B SOO 
pro Leu val Pro Asn Ser Leu Trp Gly Ser Phe Ser Asp He Gin Ala 



lie Gin Gly Val He Glu Ser Ala Leu Thr Leu Cys Ser Asp Arg 



615 



620 



630 



635 



640 



Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu Asp Lys Asp Lys Lys 

Gly Glu Lys Arg Lys Tyr Arg His Lys HI Gly Gly Tyr Ala He Gly 

60 665 o/u 

Gly Ala Ala Gin Thr Cys Ser Glu Asn Leu He Ser Phe Ala Phe Cys 

6 80 bob 
Gin Leu Phe Gly Ser Asp Lys Asp Phe Leu Val Ala Lys Asn His Thr 

ASP Th°r Tyr Ala Gly Ala Phe Tyr He Gin His He Thr Glu Cys Ser 



G°iy Phe He Gly Cys L^;; Leu Asp Lys Leu Pro Gly Ser Trp Ser His 

Lys Pro Leu Val leu Glu Gly Gin Leu Ala Tyr Ser His Val Ser Asn 

74 5 f ~>\) 

ASP Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu Val Lys Gly Ser Trp 

755 ''^O 
Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala Ser Ser His Ser Tyr 

775 780 
Pro gL Tyr Leu His Cys Phe Asp Thr Tyr Ala Pro Tyr He Lys Leu 

Isn Leu Thr Tyr He Arg Gin Asp Ser Phe Ser Glu Lys Gly Thr Glu 



710 
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810 



Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn Leu r " 

820 g Ser Leu Pro lie 

Gly Val Lys Phe Glu Lys Phe q^r- 

Leu Thr Leu Ser Tvr d>-^ 7^ . 

850 III -^-^ ^^-9 Asn Asp Pro Lys Cys 

Thr Thr Ala Leu Val He Ser ri„ bi c 

865 870 ''^P ""'^ "^^^ -^y^ Ala Asn 



Leu „a oin Ma .e„ CI„ Val „a civ Se, Ty, 

Phe s,. P„ 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...1200
(D) OTHER INFORMATION:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

p- Z r„ ^ - - - ^ ™ z III S 

s ?p «^ 2j 2j ^ .c. ^ :: 

S S ?p ™ - S - JC. 

S ?S S S SI ™ =IT ™ 
™ o?^ ?s ^ ™ s S X S.^ -I 

s ?s 2: ™ -J s s :: 



48 



96 



144 



192 



240 



288 



90 ,3 
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GTT ATT 
Val lie 



ATA GAA 
He Glu 



AGA AAT 
Arg Asn 
130 

GGT ACT 
Gly Ser 
145 

TAT GGT 
Tyr Gly 



AAA GTT 
Lys Val 



GAA AAA 
Glu Lys 



AAT GTC 
Asn Val 

210 

CAG ACA 
Gin Thr 
225 

GTA TCT 
Val Ser 



TAT GTT 
Tyr Val 



ATG GCA 
Met Ala 



AAC AAC 
Asn Asn 
290 

ACC TCC 
Thr Ser 



AAA GCA 
Lys Ala 
100 

CTT ATC 
Leu lie 
115 

TCA CAG 
Ser Gin 



AAC ACC GCA 
Asn Thr Ala 



TCG CCT ACT 
Ser Pro Thr 



GTG ACT 
Val Thr 



TTT CAA 
Phe Gin 



GGA GAA 
Gly Glu 
180 

GAA GGA 
Glu Gly 
195 

AGA TCC 
Arg Ser 



GAT CGA 
Asp Arg 



GCC TCC 
Ala Ser 



CTA TCT 
Leu Ser 
260 

TTT TCC 
Phe Ser 
275 

GAA TAG 
Glu Tyr 



CTA GGG 
Leu Gly 



ACG TTC CCT 
Thr Phe Pro 
135 

GTA ACT GCT 
Val Thr Ala 
150 

GGC AAT TGG 
Gly Asn Trp 
165 

TTC TTC TGG 
Phe Phe Trp 



AAT AAA CAG ATA 
Asn Lys Gin He 
105 

GGC AAT GCC TAT 
Gly Asn Ala Tyr 
120 

CTG CTC TCT TTA 
Leu Leu Ser Leu 



AAT TTA GTT 
Asn Leu Val 



TTA ATG CAG 
Leu Met Gin 
215 

GGG CTG TGG 
Gly Leu Trp 
230 

GAA GAC AAT 
Glu Asp Asn 
245 

GTA AAT AAT 
Val Asn Asn 



CAA CTC TTT 
Gin Leu Phe 



AGA ATG TAT 
Arg Met Tyr 
295 

AAT ATT TTC 
Asn He Phe 
310 



GGA GAT 
Gly Asp 



AAA TTA 
Lys Leu 



GAT AAA 
Asp Lys 
185 

CCT AAT 
Pro Asn 
200 

GTT CAA 
Val Gin 



TTC CTA 
Phe Leu 
155 

GCT TGG 
Ala Trp 
170 

ATA AAT 
He Asn 



ATC TTG 
He Leu 



GAG ACC 
Glu Thr 



ATC GAT GGA ATT 
He Asp Gly He 
235 

ATA AGG TAC CGT 
He Arg Tyr Arg 
250 

GAG ATC ACA CCT 
Glu He Thr Pro 
265 

AGT AGA GAC AAA 
Ser Arg Asp Lys 
280 

TTA GGA TCG TAT 
Leu Gly Ser Tyr 



CGT TAT GCT TCG 
Arg Tyr Ala Ser 
315 



TCC 
Ser 



GAA 
Glu 



GAG 
Glu 
14 0 

CCG 
Pro 



ACA 
Thr 



TAT 

Tvr 



TGG 
Trp 



CAT 
His 
220 

GGG 
Gly 



CAT 
His 



GTG ACG GAC TCT 
Val Thr Asp Ser 
110 

GAT CTC AGA ATG 
Asp Leu Arg Met- 
125 

CCT GGA GCC GGG 
Pro Gly Ala Gly 



GTA AGT 
Val Ser 



GGA ACT 
Gly Thr 



AAG CCT 
Lys Pro 
190 

GGG AAT 
Gly Asn 
205- 

GCA TCG 
Ala Ser 



AAT TTC 
Asn Phe 



AAC AGC 
Asn Ser 



AAG 
Lys 



GAC 
Asp 



CTC 
Leu 
300 

CGT 
Arg 



CAC TAT 
His Tyr 
270 

TAT GCG 
Tyr Ala 
285 

TAT CAA 
Tyr Gin 



CCC CAT 
Pro His 
160 

GGA AAC 
Gly Asn 
175 

AGA CCT 
Arg Pro 



GCT GTA 
Ala Val 



AGC TTA 
Ser Leu 



TTC CAT 
Phe His 
240 

GGT GGA 
Gly Gly 
255 

ACT TCG 
Thr Ser 



GTT TCC 
Val Ser 



TAT ACA 
Tyr Thr 



AAC CCT AAT GTA 
Asn Pro Asn Val 
320 



336 



384 



432 



480 



528 



576 



624 



672 



720 



768 



816 



864 



912 



960 



AAC GTC GGG ATT CTC TCA AGA AGG TTT CTT CAA AAT CCT CTT ATG ATT 



1008 
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Asn Val Gly He Leu Ser Ara Arg Phe Leu n n z. . 

325 - ^ "^"^ Gin Asn Pro Leu Met iTe 

335 

TTT CAT TTT TTG TGT OCT TflT rr-r r^.^ 

ph, s: iz lu ?s r 

340 " ^ ^^"^ '^S" Met Lys Thr 

350 

GAC TAC GCA AAT TTC CCT ATC r-rn aa^ 

ASP Tyr Ala Asn Phe S nit vlj f ^ TGT 

3 55 ^ ^^"^ ^'^P Arg Asn Asn Cys 

365 

z Ty Ty Tf. r - 

370 ^ ''^^ ^eu Leu Val Phe Glu Asn 

^ 380 

GGA AAA CTT TTC CAA GOT GCC ^TC CCA ttt a-rr- 

Gly Lys Leu Phe Gin Gly Ala 11^ p ok ^ ^™ GTT 

385 ""^^ P'^^ Met Lys Leu Gin Leu Val 



,00 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(V) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

ASP Pro Lys Asn Lys Glu Tyr Thr Gly Thr He Leu Phe Ser Gly Glu 

ser Leu Ala Asn Asp Pro Arg Asp pL Lys Ser Thr He II Gin 

Asn val Asn Leu Ser Ala Gly Tyr Leu Val He Lys Glu lly Ala Glu 

Val Thr val Ser Lys Phe Thr Gin Ser Pro Gly Ser Z Leu Val Leu 

ASP .eu Gly Thr Lys Leu He Ala Ser Lys Glu xie Ala He Thr 

Oly .eu Ala He Asp He Asp Ser Leu Ser II Ser Ser Thr Ala II 

Val He Lys Ala Asn Thr Ala Asn Lvs r^r. n o 

100 Ser Val Thr Asp Ser 

He Glu Leu He Ser Pro Thr Gly Asn Ala Tv. ri . 

115 ^20 ^^"^ ^""^ 

A., se. ai„ ™. P., p„ 3„ III 

se. T., va. r.. ^ 

Tyr a^y P.e 3.„ 01. T,p ^ ^^0 

va. PH. P.. ..p 



1056 



1104 



1152 



1200 
Glu Lys Glu Gly Asn Leu Val Pro Asn Ile Leu Trp Gly Asn Ala Val
195 200 205

Asn Val Arg Ser Leu Met Gln Val Gln Glu Thr His Ala Ser Ser Leu
210 215 220

Gln Thr Asp Arg Gly Leu Trp Ile Asp Gly Ile Gly Asn Phe Phe His
225 230 235 240

Val Ser Ala Ser Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly
245 250 255

Tyr Val Leu Ser Val Asn Asn Glu Ile Thr Pro Lys His Tyr Thr Ser
260 265 270

Met Ala Phe Ser Gln Leu Phe Ser Arg Asp Lys Asp Tyr Ala Val Ser
275 280 285

Asn Asn Glu Tyr Arg Met Tyr Leu Gly Ser Tyr Leu Tyr Gln Tyr Thr
290 295 300

Thr Ser Leu Gly Asn Ile Phe Arg Tyr Ala Ser Arg Asn Pro Asn Val
305 310 315 320

Asn Val Gly Ile Leu Ser Arg Arg Phe Leu Gln Asn Pro Leu Met Ile
325 330 335

Phe His Phe Leu Cys Ala Tyr Gly His Ala Thr Asn Asp Met Lys Thr
340 345 350

Asp Tyr Ala Asn Phe Pro Met Val Lys Asn Ser Trp Arg Asn Asn Cys
355 360 365

Trp Ala Ile Lys Cys Gly Gly Ser Met Pro Leu Leu Val Phe Glu Asn
370 375 380

Gly Lys Leu Phe Gln Gly Ala Ile Pro Phe Met Lys Leu Gln Leu Val
385 390 395 400

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1830 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...1830
(D) OTHER INFORMATION:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GAT CTC ACA TTA GGG AGT CGT GAC AGT TAT AAT GGT GAT ACA AGC ACC 48
Asp Leu Thr Leu Gly Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr
1 5 10 15

ACA GAA TTT ACT CCT AAA GCG GCA ACT TCT GAT GCT AGT GGC ACG ACC 96
Thr Glu Phe Thr Pro Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr
20 25 30

TAT ATT CTC GAT GGG GAT GTC TCG ATA AGC CAA GCA GGG AAA CAA ACG 144
Tyr Ile Leu Asp Gly Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr
35 40 45
AGC TTA ACC ACA ACT TGT TTT TCT AAC ACT GCA GGA AAT CTT ACC TTT 192
Ser Leu Thr Thr Ser Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe
50 55 60

TTA GGG AAC GGA TTT TCT CTT CAT TTT GAC AAT ATT ATT TCG TCT ACT 240
Leu Gly Asn Gly Phe Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr
65 70 75 80

GTT GCA GGT GTT GTT GTT AGC AAT ACA GCA GCT TCT GGG ATT ACG AAA 288
Val Ala Gly Val Val Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys
85 90 95

TTC TCA GGA TTT TCA ACT CTT CGG ATG CTT GCA GCT CCT AGG ACC ACA 336
Phe Ser Gly Phe Ser Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr
100 105

GGT AAA GGA GCC ATT AAA ATT ACC GAT GGT CTG GTG TTT GAG ACT ATA 384
Gly Lys Gly Ala Ile Lys Ile Thr Asp Gly Leu Val Phe Glu Ser Ile
115 120 125

GGG AAT CTT GAT CCG ATT ACT GTA ACA GGA TCG ACA TCT GTT GCT GAT 432
Gly Asn Leu Asp Pro Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp
130 135

GCT CTC AAT ATT AAT AGC CCT GAT ACT GGA GAT AAC AAA GAG TAT ACG 480
Ala Leu Asn Ile Asn Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr
155 160

GGA ACC ATA GTC TTT TCT GGA GAG AAG CTC ACG GAG GCA GAA GCT AAA 528
Gly Thr Ile Val Phe Ser Gly Glu Lys Leu Thr Glu Ala Glu Ala Lys
165 170

GAT GAG AAG AAC CGC ACT TCT AAA TTA CTT CAA AAT GTT GCT TTT AAA 576
Asp Glu Lys Asn Arg Thr Ser Lys Leu Leu Gln Asn Val Ala Phe Lys
180 185 190

AAT GGG ACT GTA GTT TTA AAA GGT GAT GTC GTT TTA AGT GCG AAC GGT 624
Asn Gly Thr Val Val Leu Lys Gly Asp Val Val Leu Ser Ala Asn Gly
195 200 205

TTC TCT CAG GAT GCA AAC TCT AAG TTG ATT ATG GAT TTA GGG ACG TCG 672
Phe Ser Gln Asp Ala Asn Ser Lys Leu Ile Met Asp Leu Gly Thr Ser
215 220

TTG GTT GCA AAC ACC GAA AGT ATC GAG TTA ACG AAT TTG GAA ATT AAT 720
Leu Val Ala Asn Thr Glu Ser Ile Glu Leu Thr Asn Leu Glu Ile Asn
230 235 240

ATA GAC TCT CTC AGG AAC GGG AAA AAG ATA AAA CTC AGT GCT GCC ACA 768
Ile Asp Ser Leu Arg Asn Gly Lys Lys Ile Lys Leu Ser Ala Ala Thr
245 250 255

GCT CAG AAA GAT ATT CGT ATA GAT CGT CCT GTT GTA CTG GCA ATT AGC 816
Ala Gln Lys Asp Ile Arg Ile Asp Arg Pro Val Val Leu Ala Ile Ser
260 265 270

GAT GAG AGT TTT TAT CAA AAT GGC TTT TTG AAT GAG GAC CAT TCC TAT 864
Asp Glu Ser Phe Tyr Gln Asn Gly Phe Leu Asn Glu Asp His Ser Tyr
275

OOO .TT CT. »0 TX. J.T OCT C03 ^ O.C .TC OTO .T. TCT 0» 

ASP Gly He Leu Glu Leu Asp Ala Gly Lys Asp iie 



912 



290 



295 



z - Z I- z z I- z - i 



305 



310 



- 1% ?s s ?s =.f. - i 



325 



Tsr^T- TTT zvaT CCC ACT GCT GAG CAG GAG OCT CCG TTA 
1% fv= S S c.u =1„ =i„ ..a 



340 345 



ccr ex. ex. xoo cox xcx xxr ™ o^ oxx cox xcc ^^c c.o 

Val Pro Asn Leu Leu Trp Gly ber Futf p 

360 



355 



- IS ^i: of„ ™ °of. ?s r» z s r.^ ™ 



370 



375 



r^r^. nrr atT TCC AAT GTT TTG CAT AGG AGO GGT CGT GAA AAT 
fal S5a of. ne IS va. ..u A., se. O.y .r, O.u ..n 



385 



390 



'^Z S Z 5- Z IZ S Sa fS 



405 410 



.rr-r^ nnn CCT CGT GAT PCC TTG TCT CTG GGT TTT GCT CAG CTC 
?S M« 'pfo ofv ofv X., S,. Oly ^^^J 



420 425 



IS Jf. J% z z z - ?s r„ r.^^ Sa - ?s 



435 



440 



960 



1008 



1056 



1104 



1152 



1200 



1248 



1296 



1344 



TAC GCA GGA TCT TTA CGT TTG CAG CAC GAT GCT TCC CTA TAC TCT GTG 1392
Tyr Ala Gly Ser Leu Arg Leu Gln His Asp Ala Ser Leu Tyr Ser Val
450 455 460

CTC AGT ATC CTT TTA GGA GAG GGA GGA CTC CGC GAG ATC CTG TTG CCT 1440
Val Ser Ile Leu Leu Gly Glu Gly Gly Leu Arg Glu Ile Leu Leu Pro
465 470 475

TAT GTT TCC AAT ACT CTG CCG TGC TCT TTC TAT GGG CAG CTT AGC TAC 1488
Tyr Val Ser Asn Thr Leu Pro Cys Ser Phe Tyr Gly Gln Leu Ser Tyr
485

GGC CAT ACG GAT CAT CGC AT. AA. AC. .M. TCT CT.. CCC CCC rcC rrr 1536
Gly His Thr Asp His Arg Met Lys Thr Glu Ser Leu Pro Pro
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S05 

510 



CCG ACG CTC TCG ACG GAT CAT ACT Tr^x ^r-^ 

Pro Th. .eu Se. rUr Asp uTs rZ I7r IT, IT ""'^ 

515 ^ ^ ■^y'' Trp Ala 

525 

GGA GAG CTG GGA ACT CGA GTT GCT CTT rnz. 

Gly Glu Leu Gly Thr Arg Val til yll f """^"^ "^^^ ^GA 

530 3 Glu Asn Thr Ser Gly Arg Gly 

54 0 

TTT TTC CGA GAG TAC ACT CCA 

Phe Phe Arg Glu Tyr Pro SI^ T f ^ ^'^^ 

545 "° f-he Val Lys Val Gin Ala Val Tyr Ser 

CGC CAA GAT AGO TTT GTT GAA CTA rr^ r^r^ 

Arg Gin Asp Ser Phe Val G^^ Z.u Gly IZ. f,' T ""^ 

565 ^ 5:;^ Ser Arg Asp Phe Ser 

GAT TCG CAT CTT TAT AAC CTT rnn 

ASP ser His Leu Tyr Tsl lIu AU Zll p""" ^"^^ GAG 

580 Gly He Lys Leu Glu 

590 

AAA CGG TTT GCA GAG CAA TAT tst r-.^ 

^ys Arg Phe Ala Glu ITn IZ If, J!T ^^^^ ^AT TCT CCA 

595 "^^ val Ala Met Tyr Ser Pro 

605 



GAT GTT 
Asp Val 
610 



(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 610 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Asp Leu Thr Leu Gly Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr

Thr Glu Phe Thr Pro Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr

Tyr Ile Leu Asp Gly Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr

Ser Leu Thr Thr Ser Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe

Leu Gly Asn Gly Phe Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr

Val Ala Gly Val Val Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys

Phe Ser Gly Phe Ser Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr



1584 



1632 



1680 



1728 



1776 



1824 



1830 
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Gly Lys 

Gly Asn 
130 
Ala Leu 
145 

Gly Thr 

Asp Glu 

Asn Gly 

Phe Ser 
210 
Leu Val 
225 

He Asp 

Ala Gin 

Asp Glu 

Asp Gly 
290 
Asp Ser 
305 

Lys Trp 

Trp Ala 

Val Pro 

Asn Phe 
370 
Trp Val 
385 

Gin Arg 

Thr Arg 

Phe Ala 

Tyr Ala 
450 

Val Ser 
465 

Tyr Val 

Gly His 

Pro Thr 

Gly Glu 
530 



100 
Gly Ala 
115 

Leu Asp 

Asn He 

He Val 

Lys Asn 
180 
Thr Val 
195 

Gin Asp 

Ala Asn 

Ser Leu 

Lys Asp 
260 
Ser Phe 
275 

He Leu 

Arg Ser 

Thr He 

Lys Gin 
340 

Asn Leu 
355 

He Glu 

Ala Gly 

Lys Phe 

Met Pro 
420 
Arg Asp 
435 

Gly Ser 

He Leu 

Ser Asn 

Thr Asp 
500 
Leu Ser 
515 

Leu Gly 



He Lys 

Pro He 

Asn Ser 
150 
Phe Ser 
165 

Arg Thr 

Val Leu 

Ala Asn 

Thr Glu 
230 
Arg Asn 
245 

He Arg 

Tyr Gin 

Glu Leu 

He Asp 
310 
Asn Trp 
325 

Ser Phe 

Leu Trp 

Leu Gly 

He Ser 
390 
Arg His 
405 

Gly Gly 

Lys Asp 

Leu Arg 

Leu Gly 
470 
Thr Leu 
485 

His Arg 
Thr Asp 
Thr Arg 



He Thr 
120 
Thr Val 
135 

Pro Asp 

Gly Glu 

Ser Lys 

Lys Gly 
200 
Ser Lys 
215 

Ser He 

Gly Lys 

He Asp 

Asn Gly 
280 
Asp Ala 
295 

Ala Val 

Ser Thr 

Asn Pro 

Gly Ser 
360 
Thr Glu 
375 

Asn Val 

Val Ser 

Asp Thr 

Tyr Phe 
440 

Leu Gin 
455 

Glu Gly 

Pro Cys 

Met Lys 

His Thr 
520 
Val Ala 
535 



87 

105 

Asp Gly Leu 

Thr Gly Ser 

Thr Gly Asp 
155 

Lys Leu Thr 
170 

Leu Leu Gin 
185 

Asp Val Val 

Leu He Met 

Glu Leu Thr 
235 

Lys He Lys 

250 
Arg Pro Val 
255 

Phe Leu Asn 

Gly Lys Asp 

Gin Ser Pro 
315 

Asp Asp Lys 
330 

Thr Ala Glu 
345 

Phe He Asp 

Gly Ala Pro 

Leu His Arg 
395 

Gly Gly Ala 

410 
Leu Ser Leu 
425 

Met Asn Thr 

His Asp Ala 

Gly Leu Arg 
475 

Ser Phe Tyr 

490 
Thr Glu Ser 
505 

Ser Trp Gly 
Val Glu Asn 



Val Phe 
125 
Thr Ser 
140 

Asn Lys 

Glu Ala 

Asn Val 

Leu Ser 
205 
Asp Leu 
220 

Asn Leu 

Leu Ser 

Val Leu 

Glu Asp 
285 
He Val 
300 

Tyr Gly 

Lys Ala 

Gin Glu 

Val Arg 
365 
Tyr Glu 
380 

Ser Gly 

Val Val 

Gly Phe 

Asn Phe 
445 

Ser Leu 
460 

Glu He 

Gly Gin 

Leu Pro 

Gly Tyr 
525 
Thr Ser 
540 



110 

Glu Ser 

Val Ala 

Glu Tyr 

Glu Ala 
175 
Ala Phe 
190 

Ala Asn 
Gly Thr 
Glu He 

Ala Ala 

255 
Ala He 
270 

His Ser 

He Ser 

Tyr Gin 

Thr Val 
335 
Ala Pro 
350 

Ser Phe 

Lys Arg 

Arg Glu 

Gly Ala 
415 
Ala Gin 
430 

Ala Lys 

Tyr Ser 

Leu Leu 

Leu Ser 
495 
Pro Pro 
510 

Val Trp 



He 

Asp 

Thr- 

160 

Lys 

Lys 

Gly 

Ser 

Asn 
240 
Thr 

Ser 

Tyr 

Ala 

Gly 
320 
Ser 

Leu 

Gin 

Phe 

Asn 
400 

Ser 

Leu 

Thr 

Val 

Pro 
480 
Tyr 



Pro 
Ala 
Gly Arg Gly 
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Arg Gin Asp Ser Phe Val Glu Leu Gly Ala He c,^ . 

565 ^ Aj^g Asp Phe S^r 

Asp Ser His Leu Tyr Asn Leu Al., ti 575 

580 " ""'^ He Lys Leu Glu 

Lys Arg Phe Ala Glu Gin Tyr Tvr uil u ^ 

"5 Ion "'^^ ""'^ "^'^ ser Pro 

Asp Val 605 

610 
1. Species specific diagnostic test for identifying
infection of a mammal, such as a human, with Chlamydia
pneumoniae, said test comprising detecting in a patient or in
a patient sample the presence of antibodies against one or
more proteins from the outer membrane of Chlamydia pneumoniae,
said proteins being of a molecular weight of 100.3-89.6 kDa
or of 56.1 kDa, or detecting the presence of nucleic acid
fragments encoding said outer membrane proteins.

2. Diagnostic test according to claim 1, wherein the outer
membrane protein has the sequence as shown in SEQ ID NO: 2,
SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ
ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ
ID NO: 20, SEQ ID NO: 22, or in SEQ ID NO: 24, or a variant
or subsequence thereof.

3. Diagnostic test according to claim 1, wherein the nucleic
acid fragment has the sequence shown in SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO:
11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO:
19, SEQ ID NO: 21, or in SEQ ID NO: 23, or a variant or
subsequence thereof.

4. Diagnostic test according to claim 3, wherein detection of
nucleic acid fragments is obtained by using nucleic acid
amplification.

5. Diagnostic test according to claim 4, wherein detection
of nucleic acid fragments is obtained by using polymerase
chain reaction.

6. A nucleic acid fragment derived from Chlamydia pneumoniae
comprising the nucleotide sequence SEQ ID NO: 1, SEQ ID NO:
3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11,
SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19,
SEQ ID NO: 21, or SEQ ID NO: 23, or a variant or subsequence
of said nucleotide sequence which has a sequence homology of
at least 50% with any of the sequences mentioned.

7. A protein derived from Chlamydia pneumoniae having the
amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID
NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID
NO: 22, or SEQ ID NO: 24, or a variant or subsequence thereof
having a sequence similarity of at least 50% and a similar
biological function.

8. Polyclonal monospecific antibody against the protein
with the sequence shown in SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:
14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO:
22, or SEQ ID NO: 24, or a variant or subsequence thereof.

9. A diagnostic kit for the diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae, said kit
comprising a protein with the amino acid sequence SEQ ID NO:
2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10,
SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18,
SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a variant
or subsequence thereof.

10. A diagnostic kit for the diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae, said kit
comprising antibodies against a protein with the amino acid
sequence SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO:
8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO:
16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID
NO: 24, or a variant or subsequence thereof.



11. A diagnostic kit for the diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae, said kit
comprising a nucleic acid fragment with the sequence SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO:
9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO:
17, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23, or a
variant or subsequence thereof.

12. A composition for immunizing a mammal, such as a
human, against Chlamydia pneumoniae, said composition
comprising a protein with the amino acid sequence shown in
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ
ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or
a variant or subsequence thereof.

13. Use of a protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae.

14. Use of the protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in an undenatured form, in
diagnosis of infection of a mammal, such as a human, with
Chlamydia pneumoniae.






15. Use of a protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof, for immunizing a mammal, such
as a human, against Chlamydia pneumoniae.

16. Use of the protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in an undenatured form, for
immunizing a mammal, such as a human, against Chlamydia
pneumoniae.

17. Use of a nucleic acid fragment with the nucleotide
sequence shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ
ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, or
SEQ ID NO: 23, or a variant or subsequence of said nucleotide
sequence which has a sequence homology of at least 50% with
any of the sequences mentioned, for immunizing a mammal, such
as a human, against Chlamydia pneumoniae.
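
The threshold of "a sequence homology of at least 50%" used in the claims above can be illustrated with a small calculation. The sketch below is not taken from the specification, and this section does not state how homology is to be computed. It assumes, purely for illustration, that homology is measured as percent identity over two sequences that have already been aligned to equal length, with '-' as the gap character. The function names percent_identity and meets_homology_threshold and the variant sequence are hypothetical; the reference string is the first four codons of the open reading frame in SEQ ID NO: 25.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the specification): identity is counted over all
    alignment columns except those that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences is ignored
        compared += 1
        if a == b:
            matches += 1  # both-gap columns were skipped, so this is a residue match
    return 100.0 * matches / compared if compared else 0.0


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True if the percent identity is at least the threshold (50% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First four codons of the SEQ ID NO: 25 reading frame (ATG AAA TCG CAA)
reference = "ATGAAATCGCAA"
# Hypothetical variant with two substitutions, for illustration only
variant = "ATGAAGTCGCAT"

print(round(percent_identity(reference, variant), 2))
print(meets_homology_threshold(reference, variant))
```

In practice the two sequences would first be aligned with a dedicated alignment tool, and the choice of alignment method and gap treatment changes the percentage, so the figure computed here is only one possible reading of the 50% criterion.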
Fig. 3

Fig. 5

Fig. 7. C. pneumoniae omp4-15 gene clusters. First row: omp12, omp11, omp10, omp5, omp4, omp13, omp14. Second row (labelled 2): omp6, omp7, omp8, omp9, omp15. Scale in kbp: 0 to 30.